PARQ: Piecewise-Affine Regularized Quantization

Lisa Jin, Jianhao Ma, Zechun Liu, Andrey Gromov, Aaron Defazio, Lin Xiao
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:28044-28062, 2025.

Abstract

We develop a novel optimization method for quantization-aware training (QAT). Specifically, we show that convex, piecewise-affine regularization (PAR) can effectively induce neural network weights to cluster towards discrete values. We minimize PAR-regularized loss functions using an aggregate proximal stochastic gradient method (AProx) and prove that it enjoys last-iterate convergence. Our approach provides an interpretation of the straight-through estimator (STE), a widely used heuristic for QAT, as the asymptotic form of PARQ. We conduct experiments to demonstrate that PARQ obtains competitive performance on convolution- and transformer-based vision tasks.
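The straight-through estimator (STE) mentioned in the abstract is a standard QAT heuristic: weights are rounded to discrete levels in the forward pass, while the backward pass treats the rounding as the identity so gradients flow through. A minimal NumPy sketch of this idea is below; the function names and the quantization grid are illustrative stand-ins, not the paper's actual PAR/PARQ scheme.

```python
import numpy as np

def quantize(w, levels):
    """Round each weight to its nearest value in a discrete grid (forward pass)."""
    levels = np.asarray(levels, dtype=float)
    idx = np.argmin(np.abs(np.asarray(w, dtype=float)[..., None] - levels), axis=-1)
    return levels[idx]

def ste_backward(grad_out):
    """STE: the rounding step is treated as identity, so the gradient passes through unchanged."""
    return grad_out
```

For example, `quantize(np.array([0.4, -0.9, 1.2]), [-1.0, 0.0, 1.0])` maps each weight to the nearest level in `{-1, 0, 1}`, while `ste_backward` simply forwards the incoming gradient, which is what makes STE a heuristic rather than an exact derivative.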

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-jin25e,
  title     = {{PARQ}: Piecewise-Affine Regularized Quantization},
  author    = {Jin, Lisa and Ma, Jianhao and Liu, Zechun and Gromov, Andrey and Defazio, Aaron and Xiao, Lin},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages     = {28044--28062},
  year      = {2025},
  editor    = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume    = {267},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--19 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/jin25e/jin25e.pdf},
  url       = {https://proceedings.mlr.press/v267/jin25e.html},
  abstract  = {We develop a novel optimization method for quantization-aware training (QAT). Specifically, we show that convex, piecewise-affine regularization (PAR) can effectively induce neural network weights to cluster towards discrete values. We minimize PAR-regularized loss functions using an aggregate proximal stochastic gradient method (AProx) and prove that it enjoys last-iterate convergence. Our approach provides an interpretation of the straight-through estimator (STE), a widely used heuristic for QAT, as the asymptotic form of PARQ. We conduct experiments to demonstrate that PARQ obtains competitive performance on convolution- and transformer-based vision tasks.}
}
Endnote
%0 Conference Paper
%T PARQ: Piecewise-Affine Regularized Quantization
%A Lisa Jin
%A Jianhao Ma
%A Zechun Liu
%A Andrey Gromov
%A Aaron Defazio
%A Lin Xiao
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-jin25e
%I PMLR
%P 28044--28062
%U https://proceedings.mlr.press/v267/jin25e.html
%V 267
%X We develop a novel optimization method for quantization-aware training (QAT). Specifically, we show that convex, piecewise-affine regularization (PAR) can effectively induce neural network weights to cluster towards discrete values. We minimize PAR-regularized loss functions using an aggregate proximal stochastic gradient method (AProx) and prove that it enjoys last-iterate convergence. Our approach provides an interpretation of the straight-through estimator (STE), a widely used heuristic for QAT, as the asymptotic form of PARQ. We conduct experiments to demonstrate that PARQ obtains competitive performance on convolution- and transformer-based vision tasks.
APA
Jin, L., Ma, J., Liu, Z., Gromov, A., Defazio, A., & Xiao, L. (2025). PARQ: Piecewise-Affine Regularized Quantization. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:28044-28062. Available from https://proceedings.mlr.press/v267/jin25e.html.