Same, Same But Different: Recovering Neural Network Quantization Error Through Weight Factorization

Eldad Meller; Alexander Finkelstein; Uri Almog; Mark Grobman

Proceedings of the 36th International Conference on Machine Learning, PMLR 97:4486-4495, 2019.

Abstract

Quantization of neural networks has become common practice, driven by the need for efficient implementations of deep neural networks on embedded devices. In this paper, we exploit an oft-overlooked degree of freedom in most networks: for a given layer, individual output channels can be scaled by any factor provided that the corresponding weights of the next layer are inversely scaled. Therefore, a given network has many factorizations which change the weights of the network without changing its function. We present a conceptually simple and easy-to-implement method that uses this property and show that proper factorizations significantly decrease the degradation caused by quantization. We show improvement on a wide variety of networks and achieve state-of-the-art degradation results for MobileNets. While our focus is on quantization, this type of factorization is applicable to other domains such as network pruning, neural network regularization and network interpretability.
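
The scaling freedom described in the abstract can be illustrated with a small numerical sketch. This is not the authors' method or code; it only checks that rescaling the output channels of one layer and inversely rescaling the matching input weights of the next layer leaves the output unchanged. The function names (`two_layer`, `rescale_channels`), the fully connected layers, the ReLU activation, and the strictly positive scale factors are all assumptions made for this example.

```python
import numpy as np


def two_layer(x, w1, b1, w2, b2):
    """Hypothetical two-layer network: linear -> ReLU -> linear."""
    hidden = np.maximum(x @ w1 + b1, 0.0)  # ReLU commutes with positive scaling
    return hidden @ w2 + b2


def rescale_channels(w1, b1, w2, scales):
    """Scale each output channel of layer 1 and inversely scale layer 2.

    Assumes every entry of `scales` is strictly positive, so that
    relu(s * z) == s * relu(z) holds channel by channel.
    """
    w1_scaled = w1 * scales           # scales broadcast over output columns
    b1_scaled = b1 * scales           # the bias must be scaled with its channel
    w2_scaled = w2 / scales[:, None]  # matching input rows of the next layer
    return w1_scaled, b1_scaled, w2_scaled


rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
w1, b1 = rng.normal(size=(8, 16)), rng.normal(size=16)
w2, b2 = rng.normal(size=(16, 3)), rng.normal(size=3)
scales = rng.uniform(0.1, 10.0, size=16)

w1_s, b1_s, w2_s = rescale_channels(w1, b1, w2, scales)
original = two_layer(x, w1, b1, w2, b2)
rescaled = two_layer(x, w1_s, b1_s, w2_s, b2)

# The weights differ, but the function computed by the network is the same.
print(np.allclose(original, rescaled))
```

Because the per-channel weight ranges change while the function does not, different choices of `scales` give weight tensors that may quantize better or worse; how to choose them is the subject of the paper and is not attempted here.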

Cite this Paper

BibTeX


@InProceedings{pmlr-v97-meller19a,
  title     = {Same, Same But Different: Recovering Neural Network Quantization Error Through Weight Factorization},
  author    = {Meller, Eldad and Finkelstein, Alexander and Almog, Uri and Grobman, Mark},
  booktitle = {Proceedings of the 36th International Conference on Machine Learning},
  pages     = {4486--4495},
  year      = {2019},
  editor    = {Chaudhuri, Kamalika and Salakhutdinov, Ruslan},
  volume    = {97},
  series    = {Proceedings of Machine Learning Research},
  month     = {09--15 Jun},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v97/meller19a/meller19a.pdf},
  url       = {https://proceedings.mlr.press/v97/meller19a.html},
  abstract = 	 {Quantization of neural networks has become common practice, driven by the need for efficient implementations of deep neural networks on embedded devices. In this paper, we exploit an oft-overlooked degree of freedom in most networks - for a given layer, individual output channels can be scaled by any factor provided that the corresponding weights of the next layer are inversely scaled. Therefore, a given network has many factorizations which change the weights of the network without changing its function. We present a conceptually simple and easy to implement method that uses this property and show that proper factorizations significantly decrease the degradation caused by quantization. We show improvement on a wide variety of networks and achieve state-of-the-art degradation results for MobileNets. While our focus is on quantization, this type of factorization is applicable to other domains such as network-pruning, neural nets regularization and network interpretability.}
}

Endnote

%0 Conference Paper
%T Same, Same But Different: Recovering Neural Network Quantization Error Through Weight Factorization
%A Eldad Meller
%A Alexander Finkelstein
%A Uri Almog
%A Mark Grobman
%B Proceedings of the 36th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2019
%E Kamalika Chaudhuri
%E Ruslan Salakhutdinov
%F pmlr-v97-meller19a
%I PMLR
%P 4486--4495
%U https://proceedings.mlr.press/v97/meller19a.html
%V 97
%X Quantization of neural networks has become common practice, driven by the need for efficient implementations of deep neural networks on embedded devices. In this paper, we exploit an oft-overlooked degree of freedom in most networks - for a given layer, individual output channels can be scaled by any factor provided that the corresponding weights of the next layer are inversely scaled. Therefore, a given network has many factorizations which change the weights of the network without changing its function. We present a conceptually simple and easy to implement method that uses this property and show that proper factorizations significantly decrease the degradation caused by quantization. We show improvement on a wide variety of networks and achieve state-of-the-art degradation results for MobileNets. While our focus is on quantization, this type of factorization is applicable to other domains such as network-pruning, neural nets regularization and network interpretability.

APA


Meller, E., Finkelstein, A., Almog, U. & Grobman, M. (2019). Same, Same But Different: Recovering Neural Network Quantization Error Through Weight Factorization. Proceedings of the 36th International Conference on Machine Learning, in Proceedings of Machine Learning Research 97:4486-4495. Available from https://proceedings.mlr.press/v97/meller19a.html.
