Winograd Algorithm for AdderNet

Wenshuo Li; Hanting Chen; Mingqiang Huang; Xinghao Chen; Chunjing Xu; Yunhe Wang

Proceedings of the 38th International Conference on Machine Learning, PMLR 139:6307-6315, 2021.

Abstract

Adder neural network (AdderNet) is a new kind of deep model that replaces the massive multiplications in convolutions with additions while preserving high performance. Since the hardware complexity of additions is much lower than that of multiplications, the overall energy consumption is reduced significantly. To further reduce the hardware overhead of AdderNet, this paper studies the Winograd algorithm, a widely used fast algorithm for accelerating convolution and reducing computational cost. Unfortunately, the conventional Winograd algorithm cannot be directly applied to AdderNets, since the distributive law of multiplication does not hold for the l1-norm. Therefore, we replace the element-wise multiplication in the Winograd equation with additions and then develop a new set of transform matrices that enhance the representation ability of output features to maintain performance. Moreover, we propose an l2-to-l1 training strategy to mitigate the negative impact of the formal inconsistency. Experimental results on both FPGA and benchmarks show that the new method further reduces energy consumption without affecting the accuracy of the original AdderNet.
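
The incompatibility described in the abstract can be illustrated with a small 1-D sketch. It uses the standard F(2,3) Winograd transform matrices for ordinary convolution, not the modified matrices proposed in the paper (which the abstract does not give), and it assumes the usual AdderNet formulation in which an output is the negative l1 distance between an input window and the filter. The function names are hypothetical and chosen only for this illustration.

```python
import numpy as np

# Standard F(2,3) Winograd matrices for ordinary (multiplicative) convolution.
# Assumption: these are the textbook matrices, NOT the new transform
# matrices developed in the paper.
BT = np.array([[1, 0, -1, 0],
               [0, 1, 1, 0],
               [0, -1, 1, 0],
               [0, 1, 0, -1]], dtype=float)
G = np.array([[1, 0, 0],
              [0.5, 0.5, 0.5],
              [0.5, -0.5, 0.5],
              [0, 0, 1]], dtype=float)
AT = np.array([[1, 1, 1, 0],
               [0, 1, -1, -1]], dtype=float)


def direct_conv(d, g):
    """Two outputs of a 3-tap sliding dot product over a 4-element tile."""
    return np.array([np.dot(d[i:i + 3], g) for i in range(2)])


def winograd_conv(d, g):
    """Same two outputs via Winograd: A^T [(G g) * (B^T d)]."""
    return AT @ ((G @ g) * (BT @ d))


def direct_adder(d, g):
    """Adder-style outputs: negative l1 distance between window and filter
    (assumed AdderNet formulation)."""
    return np.array([-np.abs(d[i:i + 3] - g).sum() for i in range(2)])


def naive_adder_winograd(d, g):
    """Naive substitution: swap the element-wise product for -|a - b|
    while keeping the standard transform matrices unchanged."""
    return AT @ (-np.abs(G @ g - BT @ d))


d = np.array([1.0, 2.0, 3.0, 4.0])   # input tile
g = np.array([1.0, 0.0, -1.0])       # filter

print(direct_conv(d, g), winograd_conv(d, g))          # both give -2, -2
print(direct_adder(d, g), naive_adder_winograd(d, g))  # -6, -9 versus -9, -3
```

For multiplication the two paths agree because the product distributes over the sums hidden in the transforms. For the l1 distance they do not, so the naive substitution is a different operator rather than a faster route to the same result, which is consistent with the abstract's motivation for new transform matrices and an adapted training strategy.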

Cite this Paper

BibTeX

@InProceedings{pmlr-v139-li21c,
  title = 	 {Winograd Algorithm for AdderNet},
  author =       {Li, Wenshuo and Chen, Hanting and Huang, Mingqiang and Chen, Xinghao and Xu, Chunjing and Wang, Yunhe},
  booktitle = 	 {Proceedings of the 38th International Conference on Machine Learning},
  pages = 	 {6307--6315},
  year = 	 {2021},
  editor = 	 {Meila, Marina and Zhang, Tong},
  volume = 	 {139},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {18--24 Jul},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v139/li21c/li21c.pdf},
  url = 	 {https://proceedings.mlr.press/v139/li21c.html},
  abstract = 	 {Adder neural network (AdderNet) is a new kind of deep model that replaces the massive multiplications in convolutions with additions while preserving high performance. Since the hardware complexity of additions is much lower than that of multiplications, the overall energy consumption is reduced significantly. To further reduce the hardware overhead of AdderNet, this paper studies the Winograd algorithm, a widely used fast algorithm for accelerating convolution and reducing computational cost. Unfortunately, the conventional Winograd algorithm cannot be directly applied to AdderNets, since the distributive law of multiplication does not hold for the l1-norm. Therefore, we replace the element-wise multiplication in the Winograd equation with additions and then develop a new set of transform matrices that enhance the representation ability of output features to maintain performance. Moreover, we propose an l2-to-l1 training strategy to mitigate the negative impact of the formal inconsistency. Experimental results on both FPGA and benchmarks show that the new method further reduces energy consumption without affecting the accuracy of the original AdderNet.}
}

Endnote

%0 Conference Paper
%T Winograd Algorithm for AdderNet
%A Wenshuo Li
%A Hanting Chen
%A Mingqiang Huang
%A Xinghao Chen
%A Chunjing Xu
%A Yunhe Wang
%B Proceedings of the 38th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2021
%E Marina Meila
%E Tong Zhang
%F pmlr-v139-li21c
%I PMLR
%P 6307--6315
%U https://proceedings.mlr.press/v139/li21c.html
%V 139
%X Adder neural network (AdderNet) is a new kind of deep model that replaces the massive multiplications in convolutions with additions while preserving high performance. Since the hardware complexity of additions is much lower than that of multiplications, the overall energy consumption is reduced significantly. To further reduce the hardware overhead of AdderNet, this paper studies the Winograd algorithm, a widely used fast algorithm for accelerating convolution and reducing computational cost. Unfortunately, the conventional Winograd algorithm cannot be directly applied to AdderNets, since the distributive law of multiplication does not hold for the l1-norm. Therefore, we replace the element-wise multiplication in the Winograd equation with additions and then develop a new set of transform matrices that enhance the representation ability of output features to maintain performance. Moreover, we propose an l2-to-l1 training strategy to mitigate the negative impact of the formal inconsistency. Experimental results on both FPGA and benchmarks show that the new method further reduces energy consumption without affecting the accuracy of the original AdderNet.

APA

Li, W., Chen, H., Huang, M., Chen, X., Xu, C. & Wang, Y. (2021). Winograd Algorithm for AdderNet. Proceedings of the 38th International Conference on Machine Learning, in Proceedings of Machine Learning Research 139:6307-6315. Available from https://proceedings.mlr.press/v139/li21c.html.
