StrassenNets: Deep Learning with a Multiplication Budget

Michael Tschannen, Aran Khanna, Animashree Anandkumar
Proceedings of the 35th International Conference on Machine Learning, PMLR 80:4985-4994, 2018.

Abstract

A large fraction of the arithmetic operations required to evaluate deep neural networks (DNNs) consists of matrix multiplications, in both convolution and fully connected layers. We perform end-to-end learning of low-cost approximations of matrix multiplications in DNN layers by casting matrix multiplications as 2-layer sum-product networks (SPNs) (arithmetic circuits) and learning their (ternary) edge weights from data. The SPNs disentangle multiplication and addition operations and enable us to impose a budget on the number of multiplication operations. Combining our method with knowledge distillation and applying it to image classification DNNs (trained on ImageNet) and language modeling DNNs (using LSTMs), we obtain a first-of-a-kind reduction in number of multiplications (over 99.5%) while maintaining the predictive performance of the full-precision models. Finally, we demonstrate that the proposed framework is able to rediscover Strassen’s matrix multiplication algorithm, learning to multiply $2 \times 2$ matrices using only 7 multiplications instead of 8.
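To make the SPN view concrete: a Strassen-like scheme writes the product $C = AB$ as $\mathrm{vec}(C) = W_c\,[(W_a\,\mathrm{vec}(A)) \odot (W_b\,\mathrm{vec}(B))]$, where $\odot$ is the elementwise product and the hidden dimension $r$ equals the multiplication budget. For $2 \times 2$ matrices, Strassen's classical coefficients give ternary $W_a$, $W_b$, $W_c$ with $r = 7$. The NumPy sketch below (the names Wa, Wb, Wc are chosen here for illustration, not taken from the paper's code) checks those classical coefficients; the paper learns such ternary matrices from data rather than fixing them by hand.

import numpy as np

# Strassen's algorithm as a 2-layer sum-product network: the ternary
# matrices Wa, Wb (7x4) form signed sums of the entries of A and B,
# the elementwise product supplies the 7 multiplications, and Wc (4x7)
# recombines them into vec(C). All entries lie in {-1, 0, 1}.
Wa = np.array([[ 1, 0, 0, 1],    # a11 + a22
               [ 0, 0, 1, 1],    # a21 + a22
               [ 1, 0, 0, 0],    # a11
               [ 0, 0, 0, 1],    # a22
               [ 1, 1, 0, 0],    # a11 + a12
               [-1, 0, 1, 0],    # a21 - a11
               [ 0, 1, 0, -1]])  # a12 - a22
Wb = np.array([[ 1, 0, 0, 1],    # b11 + b22
               [ 1, 0, 0, 0],    # b11
               [ 0, 1, 0, -1],   # b12 - b22
               [-1, 0, 1, 0],    # b21 - b11
               [ 0, 0, 0, 1],    # b22
               [ 1, 1, 0, 0],    # b11 + b12
               [ 0, 0, 1, 1]])   # b21 + b22
Wc = np.array([[ 1, 0, 0, 1, -1, 0, 1],   # c11 = m1 + m4 - m5 + m7
               [ 0, 0, 1, 0, 1, 0, 0],    # c12 = m3 + m5
               [ 0, 1, 0, 1, 0, 0, 0],    # c21 = m2 + m4
               [ 1, -1, 1, 0, 0, 1, 0]])  # c22 = m1 - m2 + m3 + m6

A = np.random.randn(2, 2)
B = np.random.randn(2, 2)
# Only the elementwise product below multiplies data by data:
# 7 multiplications instead of the naive 8. Applying the ternary
# matrices costs only additions, subtractions, and zeros.
m = (Wa @ A.reshape(-1)) * (Wb @ B.reshape(-1))
C = (Wc @ m).reshape(2, 2)
assert np.allclose(C, A @ B)

In a DNN layer, A plays the role of the (fixed, ternary-encodable) weights and B the activations, so a learned budget $r$ directly caps the number of true multiplications per layer.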

Cite this Paper


BibTeX
@InProceedings{pmlr-v80-tschannen18a,
  title     = {{S}trassen{N}ets: Deep Learning with a Multiplication Budget},
  author    = {Tschannen, Michael and Khanna, Aran and Anandkumar, Animashree},
  booktitle = {Proceedings of the 35th International Conference on Machine Learning},
  pages     = {4985--4994},
  year      = {2018},
  editor    = {Dy, Jennifer and Krause, Andreas},
  volume    = {80},
  series    = {Proceedings of Machine Learning Research},
  month     = {10--15 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v80/tschannen18a/tschannen18a.pdf},
  url       = {https://proceedings.mlr.press/v80/tschannen18a.html}
}
Endnote
%0 Conference Paper
%T StrassenNets: Deep Learning with a Multiplication Budget
%A Michael Tschannen
%A Aran Khanna
%A Animashree Anandkumar
%B Proceedings of the 35th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2018
%E Jennifer Dy
%E Andreas Krause
%F pmlr-v80-tschannen18a
%I PMLR
%P 4985--4994
%U https://proceedings.mlr.press/v80/tschannen18a.html
%V 80
APA
Tschannen, M., Khanna, A. & Anandkumar, A. (2018). StrassenNets: Deep Learning with a Multiplication Budget. Proceedings of the 35th International Conference on Machine Learning, in Proceedings of Machine Learning Research 80:4985-4994. Available from https://proceedings.mlr.press/v80/tschannen18a.html.
