DPOQ: Dynamic Precision Onion Quantization

Bowen Li, Kai Huang, Siang Chen, Dongliang Xiong, Luc Claesen
Proceedings of The 13th Asian Conference on Machine Learning, PMLR 157:502-517, 2021.

Abstract

As deployment platforms and application scenarios for deep neural networks diversify, traditional fixed network architectures can no longer meet all requirements, and dynamic network inference has become a new research trend. Many slimmable and scalable networks have been proposed to satisfy different resource constraints (e.g., storage, latency, and energy), and a single network may support versatile architectural configurations, including depth, width, kernel size, and resolution. In this paper, we propose a novel network architecture reuse strategy that enables dynamic precision in parameters. Since our low-precision networks are wrapped inside the high-precision networks like the layers of an onion, we name the method dynamic precision onion quantization (DPOQ). We train the network with a joint loss and scaled gradients. To further improve performance and make networks of different precisions compatible with each other, we propose precision shift batch normalization (PSBN). We also propose a scalable input-specific inference mechanism based on this architecture, which makes the network more adaptable. Experiments on the CIFAR and ImageNet datasets show that DPOQ achieves not only better flexibility but also higher accuracy than individual quantization.
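
The abstract only sketches the "onion" nesting idea. The following minimal sketch (not the authors' code; it assumes a uniform symmetric quantizer and uses hypothetical helper names quantize_onion and extract_low_precision) illustrates how a lower-precision network could be peeled out of a higher-precision one by keeping only the most significant bits of each integer weight code, so one stored parameter set serves every supported precision.

    import torch

    def quantize_onion(w_fp, num_bits=8):
        """Uniform symmetric quantization of a weight tensor to signed
        `num_bits` integer codes. Returns the codes and the scale.
        (Assumed quantizer; the paper's exact scheme may differ.)"""
        qmax = 2 ** (num_bits - 1) - 1
        scale = w_fp.abs().max() / qmax
        codes = torch.clamp(torch.round(w_fp / scale), -qmax, qmax)
        return codes, scale

    def extract_low_precision(codes, scale, from_bits=8, to_bits=4):
        """'Peel' a lower-precision layer out of the high-precision one by
        keeping only the most significant bits of each code, so the k-bit
        weights are nested inside the 8-bit weights like onion layers."""
        shift = from_bits - to_bits
        low_codes = torch.div(codes, 2 ** shift, rounding_mode="floor")
        low_scale = scale * (2 ** shift)  # rescale so magnitudes match
        return low_codes, low_scale

    # Toy usage: the 4-bit weights are recovered from the stored 8-bit
    # weights rather than kept as a separate copy.
    w = torch.randn(64, 64)
    codes8, s8 = quantize_onion(w, num_bits=8)
    codes4, s4 = extract_low_precision(codes8, s8, from_bits=8, to_bits=4)
    w8 = codes8 * s8   # dequantized 8-bit view
    w4 = codes4 * s4   # dequantized 4-bit view sharing the same storage

In this reading, PSBN would analogously keep a separate set of batch-normalization statistics per precision so the peeled networks remain compatible; that detail is omitted from the sketch above.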

Cite this Paper


BibTeX
@InProceedings{pmlr-v157-li21a,
  title     = {DPOQ: Dynamic Precision Onion Quantization},
  author    = {Li, Bowen and Huang, Kai and Chen, Siang and Xiong, Dongliang and Claesen, Luc},
  booktitle = {Proceedings of The 13th Asian Conference on Machine Learning},
  pages     = {502--517},
  year      = {2021},
  editor    = {Balasubramanian, Vineeth N. and Tsang, Ivor},
  volume    = {157},
  series    = {Proceedings of Machine Learning Research},
  month     = {17--19 Nov},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v157/li21a/li21a.pdf},
  url       = {https://proceedings.mlr.press/v157/li21a.html}
}
Endnote
%0 Conference Paper
%T DPOQ: Dynamic Precision Onion Quantization
%A Bowen Li
%A Kai Huang
%A Siang Chen
%A Dongliang Xiong
%A Luc Claesen
%B Proceedings of The 13th Asian Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2021
%E Vineeth N. Balasubramanian
%E Ivor Tsang
%F pmlr-v157-li21a
%I PMLR
%P 502--517
%U https://proceedings.mlr.press/v157/li21a.html
%V 157
APA
Li, B., Huang, K., Chen, S., Xiong, D. & Claesen, L. (2021). DPOQ: Dynamic Precision Onion Quantization. Proceedings of The 13th Asian Conference on Machine Learning, in Proceedings of Machine Learning Research 157:502-517. Available from https://proceedings.mlr.press/v157/li21a.html.