Sharpness-Aware Data Generation for Zero-shot Quantization

Hoang Anh Dung, Cuong Pham, Trung Le, Jianfei Cai, Thanh-Toan Do
Proceedings of the 41st International Conference on Machine Learning, PMLR 235:12034-12045, 2024.

Abstract

Zero-shot quantization aims to learn a quantized model from a pre-trained full-precision model without access to the original real training data. The common idea in zero-shot quantization approaches is to generate synthetic data for quantizing the full-precision model. While it is well known that deep neural networks with low sharpness have better generalization ability, no previous zero-shot quantization work considers the sharpness of the quantized model as a criterion for generating training data. This paper introduces a novel methodology that takes the sharpness of the quantized model into account during synthetic data generation to enhance generalization. Specifically, we first demonstrate that, under certain assumptions, sharpness minimization can be attained by maximizing the gradient matching between the reconstruction loss gradients computed on synthetic data and on real validation data. We then circumvent the lack of a real validation set by approximating this gradient matching with the gradient matching between each generated sample and its neighbors. Experimental evaluations on the CIFAR-100 and ImageNet datasets demonstrate the superiority of the proposed method over state-of-the-art techniques in low-bit quantization settings.
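As a rough illustration of the core idea, the sketch below scores how well the reconstruction-loss gradient of one synthetic sample matches the gradients of its perturbed neighbors; the data generator would then be trained to maximize this score as a proxy for sharpness minimization. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the cosine-similarity measure, the neighbor count, and the perturbation scale are all assumptions.

import torch
import torch.nn.functional as F

def gradient_matching_score(model, recon_loss, sample, neighbors):
    # Parameters of the (quantized) model whose sharpness we care about.
    params = [p for p in model.parameters() if p.requires_grad]

    def flat_grad(x):
        # Flattened gradient of the reconstruction loss w.r.t. the parameters;
        # create_graph=True lets the score be backpropagated to the generator.
        grads = torch.autograd.grad(recon_loss(model, x), params, create_graph=True)
        return torch.cat([g.reshape(-1) for g in grads])

    g = flat_grad(sample)
    # Average cosine similarity between the sample's gradient and each neighbor's.
    sims = [F.cosine_similarity(g, flat_grad(n), dim=0) for n in neighbors]
    return torch.stack(sims).mean()

# Neighbors as small random perturbations of the generated sample (an assumption):
# neighbors = [sample + 0.01 * torch.randn_like(sample) for _ in range(4)]
# Adding -gradient_matching_score(...) to the data-generation loss then rewards
# samples whose local loss landscape is flat, standing in for a real validation set.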

Cite this Paper


BibTeX
@InProceedings{pmlr-v235-dung24a,
  title     = {Sharpness-Aware Data Generation for Zero-shot Quantization},
  author    = {Dung, Hoang Anh and Pham, Cuong and Le, Trung and Cai, Jianfei and Do, Thanh-Toan},
  booktitle = {Proceedings of the 41st International Conference on Machine Learning},
  pages     = {12034--12045},
  year      = {2024},
  editor    = {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix},
  volume    = {235},
  series    = {Proceedings of Machine Learning Research},
  month     = {21--27 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v235/main/assets/dung24a/dung24a.pdf},
  url       = {https://proceedings.mlr.press/v235/dung24a.html},
  abstract  = {Zero-shot quantization aims to learn a quantized model from a pre-trained full-precision model without access to the original real training data. The common idea in zero-shot quantization approaches is to generate synthetic data for quantizing the full-precision model. While it is well known that deep neural networks with low sharpness have better generalization ability, no previous zero-shot quantization work considers the sharpness of the quantized model as a criterion for generating training data. This paper introduces a novel methodology that takes the sharpness of the quantized model into account during synthetic data generation to enhance generalization. Specifically, we first demonstrate that, under certain assumptions, sharpness minimization can be attained by maximizing the gradient matching between the reconstruction loss gradients computed on synthetic data and on real validation data. We then circumvent the lack of a real validation set by approximating this gradient matching with the gradient matching between each generated sample and its neighbors. Experimental evaluations on the CIFAR-100 and ImageNet datasets demonstrate the superiority of the proposed method over state-of-the-art techniques in low-bit quantization settings.}
}
Endnote
%0 Conference Paper
%T Sharpness-Aware Data Generation for Zero-shot Quantization
%A Hoang Anh Dung
%A Cuong Pham
%A Trung Le
%A Jianfei Cai
%A Thanh-Toan Do
%B Proceedings of the 41st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Ruslan Salakhutdinov
%E Zico Kolter
%E Katherine Heller
%E Adrian Weller
%E Nuria Oliver
%E Jonathan Scarlett
%E Felix Berkenkamp
%F pmlr-v235-dung24a
%I PMLR
%P 12034--12045
%U https://proceedings.mlr.press/v235/dung24a.html
%V 235
%X Zero-shot quantization aims to learn a quantized model from a pre-trained full-precision model without access to the original real training data. The common idea in zero-shot quantization approaches is to generate synthetic data for quantizing the full-precision model. While it is well known that deep neural networks with low sharpness have better generalization ability, no previous zero-shot quantization work considers the sharpness of the quantized model as a criterion for generating training data. This paper introduces a novel methodology that takes the sharpness of the quantized model into account during synthetic data generation to enhance generalization. Specifically, we first demonstrate that, under certain assumptions, sharpness minimization can be attained by maximizing the gradient matching between the reconstruction loss gradients computed on synthetic data and on real validation data. We then circumvent the lack of a real validation set by approximating this gradient matching with the gradient matching between each generated sample and its neighbors. Experimental evaluations on the CIFAR-100 and ImageNet datasets demonstrate the superiority of the proposed method over state-of-the-art techniques in low-bit quantization settings.
APA
Dung, H.A., Pham, C., Le, T., Cai, J. & Do, T. (2024). Sharpness-Aware Data Generation for Zero-shot Quantization. Proceedings of the 41st International Conference on Machine Learning, in Proceedings of Machine Learning Research 235:12034-12045. Available from https://proceedings.mlr.press/v235/dung24a.html.
