DFQF: Data Free Quantization-aware Fine-tuning

Bowen Li, Kai Huang, Siang Chen, Dongliang Xiong, Haitian Jiang, Luc Claesen
Proceedings of The 12th Asian Conference on Machine Learning, PMLR 129:289-304, 2020.

Abstract

Data-free deep neural network quantization is a practical challenge, since the original training data is often unavailable due to privacy, proprietary, or transmission issues. Existing methods implicitly equate data-free with training-free and quantize the model manually by analyzing the weight distributions, which leads to a significant accuracy drop at bit-widths below 6 bits. In this work, we propose data-free quantization-aware fine-tuning (DFQF), in which no real training data is required and the quantized network is fine-tuned on generated images. Specifically, we first train a generator from the pre-trained full-precision network with an inception-score loss, a batch-normalization statistics loss, and an adversarial loss to synthesize a fake image set. We then fine-tune the quantized student network against the full-precision teacher network on the generated images using knowledge distillation (KD). The proposed DFQF outperforms state-of-the-art post-training quantization methods and achieves W4A4 quantization of ResNet20 on the CIFAR10 dataset within a 1% accuracy drop.
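The abstract describes a two-stage pipeline: (1) train a generator against the frozen full-precision network using inception-score, batch-normalization-statistics, and adversarial losses, and (2) fine-tune the quantized student on the synthesized images with knowledge distillation from the full-precision teacher. Below is a minimal PyTorch-style sketch of that pipeline, not the authors' implementation: the generator architecture, loss weights, surrogate inception-score term, and training schedule are illustrative assumptions, the adversarial term is omitted for brevity, and the fake-quantization wrapping of the student is assumed to be done elsewhere.

```python
# Minimal sketch of the DFQF pipeline (illustrative; see caveats above).
import torch
import torch.nn as nn
import torch.nn.functional as F


class Generator(nn.Module):
    """Small DCGAN-style generator producing 3x32x32 images (hypothetical)."""
    def __init__(self, z_dim=100):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(z_dim, 128, 4, 1, 0), nn.BatchNorm2d(128), nn.ReLU(),
            nn.ConvTranspose2d(128, 64, 4, 2, 1), nn.BatchNorm2d(64), nn.ReLU(),
            nn.ConvTranspose2d(64, 32, 4, 2, 1), nn.BatchNorm2d(32), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, 2, 1), nn.Tanh(),
        )

    def forward(self, z):
        return self.net(z.view(z.size(0), -1, 1, 1))


def collect_bn_stats(teacher):
    """Hook every BN layer of the frozen teacher to record the batch mean/var
    of its input alongside the running statistics learned on real data."""
    stats, handles = [], []
    for m in teacher.modules():
        if isinstance(m, nn.BatchNorm2d):
            def hook(mod, inp, out, store=stats, bn=m):
                x = inp[0]
                store.append((x.mean(dim=(0, 2, 3)), x.var(dim=(0, 2, 3)),
                              bn.running_mean, bn.running_var))
            handles.append(m.register_forward_hook(hook))
    return stats, handles


def bn_statistics_loss(stats):
    """Generated images should reproduce the activation statistics that the
    teacher's BN layers accumulated on the real training set."""
    return sum(F.mse_loss(mu, rm) + F.mse_loss(var, rv)
               for mu, var, rm, rv in stats)


def inception_score_surrogate(logits):
    """A common differentiable surrogate for an inception-score objective:
    confident per-image predictions, diverse predictions across the batch."""
    p = F.softmax(logits, dim=1)
    per_image_entropy = -(p * (p + 1e-8).log()).sum(dim=1).mean()
    marginal = p.mean(dim=0)
    marginal_entropy = -(marginal * (marginal + 1e-8).log()).sum()
    return per_image_entropy - marginal_entropy  # minimized by the generator


def kd_loss(student_logits, teacher_logits, T=4.0):
    """Standard temperature-scaled knowledge-distillation loss."""
    return F.kl_div(F.log_softmax(student_logits / T, dim=1),
                    F.softmax(teacher_logits / T, dim=1),
                    reduction="batchmean") * T * T


def train_dfqf(teacher, student_q, steps=1000, batch=64, z_dim=100, device="cpu"):
    teacher.eval()
    for p in teacher.parameters():            # the teacher stays frozen
        p.requires_grad_(False)
    gen = Generator(z_dim).to(device)
    opt_g = torch.optim.Adam(gen.parameters(), lr=1e-3)
    opt_s = torch.optim.SGD(student_q.parameters(), lr=1e-3, momentum=0.9)
    stats, handles = collect_bn_stats(teacher)

    # Stage 1: train the generator against the frozen full-precision network.
    # (The paper's adversarial term is omitted in this sketch.)
    for _ in range(steps):
        stats.clear()
        fake = gen(torch.randn(batch, z_dim, device=device))
        logits_t = teacher(fake)              # fills `stats` via the hooks
        loss_g = bn_statistics_loss(stats) + inception_score_surrogate(logits_t)
        opt_g.zero_grad()
        loss_g.backward()
        opt_g.step()
    for h in handles:
        h.remove()

    # Stage 2: fine-tune the (fake-)quantized student on generated images via KD.
    for _ in range(steps):
        with torch.no_grad():
            fake = gen(torch.randn(batch, z_dim, device=device))
            logits_t = teacher(fake)
        loss_s = kd_loss(student_q(fake), logits_t)
        opt_s.zero_grad()
        loss_s.backward()
        opt_s.step()
    return student_q
```

In practice, `student_q` would be a fake-quantized (e.g., W4A4) copy of the teacher whose quantizers typically rely on a straight-through estimator, so that the KD gradients can update the latent full-precision weights during fine-tuning.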

Cite this Paper


BibTeX
@InProceedings{pmlr-v129-li20a,
  title     = {DFQF: Data Free Quantization-aware Fine-tuning},
  author    = {Li, Bowen and Huang, Kai and Chen, Siang and Xiong, Dongliang and Jiang, Haitian and Claesen, Luc},
  booktitle = {Proceedings of The 12th Asian Conference on Machine Learning},
  pages     = {289--304},
  year      = {2020},
  editor    = {Pan, Sinno Jialin and Sugiyama, Masashi},
  volume    = {129},
  series    = {Proceedings of Machine Learning Research},
  month     = {18--20 Nov},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v129/li20a/li20a.pdf},
  url       = {https://proceedings.mlr.press/v129/li20a.html},
  abstract  = {Data free deep neural network quantization is a practical challenge, since the original training data is often unavailable due to some privacy, proprietary or transmission issues. The existing methods implicitly equate data-free with training-free and quantize model manually through analyzing the weights’ distribution. It leads to a significant accuracy drop in lower than 6-bit quantization. In this work, we propose the data free quantization-aware fine-tuning (DFQF), wherein no real training data is required, and the quantized network is fine-tuned with generated images. Specifically, we start with training a generator from the pre-trained full-precision network with inception score loss, batch-normalization statistics loss and adversarial loss to synthesize a fake image set. Then we fine-tune the quantized student network with the full-precision teacher network and the generated images by utilizing knowledge distillation (KD). The proposed DFQF outperforms state-of-the-art post-train quantization methods, and achieve W4A4 quantization of ResNet20 on the CIFAR10 dataset within 1% accuracy drop.}
}
Endnote
%0 Conference Paper
%T DFQF: Data Free Quantization-aware Fine-tuning
%A Bowen Li
%A Kai Huang
%A Siang Chen
%A Dongliang Xiong
%A Haitian Jiang
%A Luc Claesen
%B Proceedings of The 12th Asian Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2020
%E Sinno Jialin Pan
%E Masashi Sugiyama
%F pmlr-v129-li20a
%I PMLR
%P 289--304
%U https://proceedings.mlr.press/v129/li20a.html
%V 129
%X Data free deep neural network quantization is a practical challenge, since the original training data is often unavailable due to some privacy, proprietary or transmission issues. The existing methods implicitly equate data-free with training-free and quantize model manually through analyzing the weights’ distribution. It leads to a significant accuracy drop in lower than 6-bit quantization. In this work, we propose the data free quantization-aware fine-tuning (DFQF), wherein no real training data is required, and the quantized network is fine-tuned with generated images. Specifically, we start with training a generator from the pre-trained full-precision network with inception score loss, batch-normalization statistics loss and adversarial loss to synthesize a fake image set. Then we fine-tune the quantized student network with the full-precision teacher network and the generated images by utilizing knowledge distillation (KD). The proposed DFQF outperforms state-of-the-art post-train quantization methods, and achieve W4A4 quantization of ResNet20 on the CIFAR10 dataset within 1% accuracy drop.
APA
Li, B., Huang, K., Chen, S., Xiong, D., Jiang, H., & Claesen, L. (2020). DFQF: Data Free Quantization-aware Fine-tuning. Proceedings of The 12th Asian Conference on Machine Learning, in Proceedings of Machine Learning Research 129:289-304. Available from https://proceedings.mlr.press/v129/li20a.html.
