Get rid of your constraints and reparametrize: A study in NNLS and implicit bias

Hung-Hsu Chou, Johannes Maly, Claudio Mayrink Verdun, Bernardo Freitas Paulo da Costa, Heudson Mirandola
Proceedings of The 28th International Conference on Artificial Intelligence and Statistics, PMLR 258:829-837, 2025.

Abstract

In recent years, there has been significant interest in understanding the implicit bias of gradient descent optimization and its connection to the generalization properties of overparametrized neural networks. Several works have observed that when training linear diagonal networks on the square loss for regression tasks (which corresponds to overparametrized linear regression), gradient descent converges to special solutions, e.g., non-negative ones. We connect this observation to Riemannian optimization and view overparametrized gradient descent (GD) with identical initialization as a Riemannian GD. We use this fact to solve non-negative least squares (NNLS), an important problem behind many techniques, e.g., non-negative matrix factorization. We show that gradient flow on the reparametrized objective converges globally to NNLS solutions, and we provide convergence rates for its discretized counterpart. Unlike previous methods, we do not rely on the calculation of exponential maps or geodesics. We further show accelerated convergence using a second-order ODE, which lends itself to accelerated descent methods. Finally, we establish stability against negative perturbations and discuss generalization to other constrained optimization problems.
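
To make the reparametrization idea concrete, here is a minimal sketch (not the authors' implementation; the initialization scale alpha, step size lr, and iteration count are illustrative assumptions). The non-negativity constraint in min_{x >= 0} ||Ax - b||^2 is removed by writing x = w * w elementwise and running plain gradient descent on the resulting unconstrained objective in w, starting from an identical positive initialization w_0 = alpha * 1.

```python
import numpy as np

def nnls_reparametrized_gd(A, b, alpha=1e-2, lr=5e-4, n_iter=50000):
    """Sketch: approximate argmin_{x >= 0} ||Ax - b||^2 via the elementwise
    reparametrization x = w * w and plain gradient descent on
    g(w) = ||A(w * w) - b||^2 (illustrative hyperparameters)."""
    n = A.shape[1]
    w = alpha * np.ones(n)              # identical, small, positive initialization
    for _ in range(n_iter):
        residual = A @ (w * w) - b
        grad_x = 2.0 * A.T @ residual   # gradient of ||Ax - b||^2 w.r.t. x
        grad_w = 2.0 * w * grad_x       # chain rule: d(w_i^2)/dw_i = 2 w_i
        w = w - lr * grad_w
    return w * w                        # non-negative by construction

# Toy usage: recover a non-negative vector from noiseless measurements.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 10))
x_true = np.maximum(rng.standard_normal(10), 0.0)
b = A @ x_true
x_hat = nnls_reparametrized_gd(A, b)
print("relative error:", np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true))
```

For a sanity check, the output can be compared against a standard active-set solver such as scipy.optimize.nnls.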

Cite this Paper

BibTeX
@InProceedings{pmlr-v258-chou25a,
  title     = {Get rid of your constraints and reparametrize: A study in NNLS and implicit bias},
  author    = {Chou, Hung-Hsu and Maly, Johannes and Verdun, Claudio Mayrink and da Costa, Bernardo Freitas Paulo and Mirandola, Heudson},
  booktitle = {Proceedings of The 28th International Conference on Artificial Intelligence and Statistics},
  pages     = {829--837},
  year      = {2025},
  editor    = {Li, Yingzhen and Mandt, Stephan and Agrawal, Shipra and Khan, Emtiyaz},
  volume    = {258},
  series    = {Proceedings of Machine Learning Research},
  month     = {03--05 May},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v258/main/assets/chou25a/chou25a.pdf},
  url       = {https://proceedings.mlr.press/v258/chou25a.html}
}
Endnote
%0 Conference Paper
%T Get rid of your constraints and reparametrize: A study in NNLS and implicit bias
%A Hung-Hsu Chou
%A Johannes Maly
%A Claudio Mayrink Verdun
%A Bernardo Freitas Paulo da Costa
%A Heudson Mirandola
%B Proceedings of The 28th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2025
%E Yingzhen Li
%E Stephan Mandt
%E Shipra Agrawal
%E Emtiyaz Khan
%F pmlr-v258-chou25a
%I PMLR
%P 829--837
%U https://proceedings.mlr.press/v258/chou25a.html
%V 258
APA
Chou, H., Maly, J., Verdun, C.M., da Costa, B.F.P. & Mirandola, H. (2025). Get rid of your constraints and reparametrize: A study in NNLS and implicit bias. Proceedings of The 28th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 258:829-837. Available from https://proceedings.mlr.press/v258/chou25a.html.
