Fixed Point Quantization of Deep Convolutional Networks

Darryl Lin; Sachin Talathi; Sreekanth Annapureddy

Proceedings of The 33rd International Conference on Machine Learning, PMLR 48:2849-2858, 2016.

Abstract

In recent years, increasingly complex architectures for deep convolution networks (DCNs) have been proposed to boost performance on image recognition tasks. However, these gains have come at the cost of a substantial increase in computation and model storage resources. Fixed point implementation of DCNs has the potential to alleviate some of these complexities and facilitate deployment on embedded hardware. In this paper, we propose a quantizer design for fixed point implementation of DCNs. We formulate and solve an optimization problem to identify the optimal fixed point bit-width allocation across DCN layers. Our experiments show that, in comparison to equal bit-width settings, fixed point DCNs with optimized bit-width allocation offer >20% reduction in model size without any loss in accuracy on the CIFAR-10 benchmark. We also demonstrate that fine-tuning can further enhance the accuracy of fixed point DCNs beyond that of the original floating point model. In doing so, we report a new state-of-the-art fixed point performance of 6.78% error rate on the CIFAR-10 benchmark.

Cite this Paper

BibTeX


@InProceedings{pmlr-v48-linb16,
  title = 	 {Fixed Point Quantization of Deep Convolutional Networks},
  author = 	 {Lin, Darryl and Talathi, Sachin and Annapureddy, Sreekanth},
  booktitle = 	 {Proceedings of The 33rd International Conference on Machine Learning},
  pages = 	 {2849--2858},
  year = 	 {2016},
  editor = 	 {Balcan, Maria Florina and Weinberger, Kilian Q.},
  volume = 	 {48},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {New York, New York, USA},
  month = 	 {20--22 Jun},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v48/linb16.pdf},
  url = 	 {https://proceedings.mlr.press/v48/linb16.html},
  abstract = 	 {In recent years increasingly complex architectures for deep convolution networks (DCNs) have been proposed to boost the performance on image recognition tasks. However, the gains in performance have come at a cost of substantial increase in computation and model storage resources. Fixed point implementation of DCNs has the potential to alleviate some of these complexities and facilitate potential deployment on embedded hardware. In this paper, we propose a quantizer design for fixed point implementation of DCNs. We formulate and solve an optimization problem to identify optimal fixed point bit-width allocation across DCN layers. Our experiments show that in comparison to equal bit-width settings, the fixed point DCNs with optimized bit width allocation offer >20% reduction in the model size without any loss in accuracy on CIFAR-10 benchmark. We also demonstrate that fine-tuning can further enhance the accuracy of fixed point DCNs beyond that of the original floating point model. In doing so, we report a new state-of-the-art fixed point performance of 6.78% error-rate on CIFAR-10 benchmark.}
}

Endnote

%0 Conference Paper
%T Fixed Point Quantization of Deep Convolutional Networks
%A Darryl Lin
%A Sachin Talathi
%A Sreekanth Annapureddy
%B Proceedings of The 33rd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2016
%E Maria Florina Balcan
%E Kilian Q. Weinberger
%F pmlr-v48-linb16
%I PMLR
%P 2849--2858
%U https://proceedings.mlr.press/v48/linb16.html
%V 48
%X In recent years increasingly complex architectures for deep convolution networks (DCNs) have been proposed to boost the performance on image recognition tasks. However, the gains in performance have come at a cost of substantial increase in computation and model storage resources. Fixed point implementation of DCNs has the potential to alleviate some of these complexities and facilitate potential deployment on embedded hardware. In this paper, we propose a quantizer design for fixed point implementation of DCNs. We formulate and solve an optimization problem to identify optimal fixed point bit-width allocation across DCN layers. Our experiments show that in comparison to equal bit-width settings, the fixed point DCNs with optimized bit width allocation offer >20% reduction in the model size without any loss in accuracy on CIFAR-10 benchmark. We also demonstrate that fine-tuning can further enhance the accuracy of fixed point DCNs beyond that of the original floating point model. In doing so, we report a new state-of-the-art fixed point performance of 6.78% error-rate on CIFAR-10 benchmark.

RIS


TY  - CPAPER
TI  - Fixed Point Quantization of Deep Convolutional Networks
AU  - Darryl Lin
AU  - Sachin Talathi
AU  - Sreekanth Annapureddy
BT  - Proceedings of The 33rd International Conference on Machine Learning
DA  - 2016/06/11
ED  - Maria Florina Balcan
ED  - Kilian Q. Weinberger
ID  - pmlr-v48-linb16
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 48
SP  - 2849
EP  - 2858
L1  - http://proceedings.mlr.press/v48/linb16.pdf
UR  - https://proceedings.mlr.press/v48/linb16.html
AB  - In recent years increasingly complex architectures for deep convolution networks (DCNs) have been proposed to boost the performance on image recognition tasks. However, the gains in performance have come at a cost of substantial increase in computation and model storage resources. Fixed point implementation of DCNs has the potential to alleviate some of these complexities and facilitate potential deployment on embedded hardware. In this paper, we propose a quantizer design for fixed point implementation of DCNs. We formulate and solve an optimization problem to identify optimal fixed point bit-width allocation across DCN layers. Our experiments show that in comparison to equal bit-width settings, the fixed point DCNs with optimized bit width allocation offer >20% reduction in the model size without any loss in accuracy on CIFAR-10 benchmark. We also demonstrate that fine-tuning can further enhance the accuracy of fixed point DCNs beyond that of the original floating point model. In doing so, we report a new state-of-the-art fixed point performance of 6.78% error-rate on CIFAR-10 benchmark.
ER  -

APA


Lin, D., Talathi, S. & Annapureddy, S. (2016). Fixed Point Quantization of Deep Convolutional Networks. Proceedings of The 33rd International Conference on Machine Learning, in Proceedings of Machine Learning Research 48:2849-2858. Available from https://proceedings.mlr.press/v48/linb16.html.
