Noise Regularizes Over-parameterized Rank One Matrix Recovery, Provably

Tianyi Liu, Yan Li, Enlu Zhou, Tuo Zhao
Proceedings of The 25th International Conference on Artificial Intelligence and Statistics, PMLR 151:2784-2802, 2022.

Abstract

We investigate the role of noise in optimization algorithms for learning over-parameterized models. Specifically, we consider the recovery of a rank one matrix $Y^*\in \mathbb{R}^{d\times d}$ from a noisy observation $Y$ using an over-parameterized model: we parameterize the rank one matrix $Y^*$ by $XX^\top$, where $X\in \mathbb{R}^{d\times d}$. We then show that, under mild conditions, the estimator obtained by randomly perturbed gradient descent on the square loss attains a mean square error of $O(\sigma^2/d)$, where $\sigma^2$ is the variance of the observational noise. In contrast, the estimator obtained by gradient descent without random perturbation only attains a mean square error of $O(\sigma^2)$. Our result partially justifies the implicit regularization effect of noise when learning over-parameterized models, and provides new understanding of the training of over-parameterized neural networks.
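The setup above admits a compact numerical illustration. The following is a minimal sketch, not the paper's exact algorithm: the helper name `perturbed_gd` and all hyperparameters (step size, perturbation level, initialization scale, iteration count) are illustrative assumptions. It runs gradient descent on $f(X) = \tfrac{1}{2}\|XX^\top - Y\|_F^2$, optionally adding i.i.d. Gaussian noise to each iterate as a stand-in for the paper's random perturbation.

```python
import numpy as np

def perturbed_gd(Y, d, eta=1e-3, nu=0.0, n_iters=5000, seed=0):
    """(Perturbed) gradient descent on f(X) = 0.5 * ||X X^T - Y||_F^2.

    Assumes Y is symmetric, so grad f(X) = 2 (X X^T - Y) X.
    nu > 0 adds i.i.d. Gaussian noise of std nu to each iterate
    (illustrative stand-in for the paper's perturbation); nu = 0 is plain GD.
    """
    rng = np.random.default_rng(seed)
    X = 1e-2 * rng.standard_normal((d, d))  # small random initialization
    for _ in range(n_iters):
        grad = 2.0 * (X @ X.T - Y) @ X      # gradient of the squared loss
        X = X - eta * grad
        if nu > 0:
            X = X + nu * rng.standard_normal((d, d))  # random perturbation
    return X

# Synthetic instance: Y = x* x*^T + symmetric Gaussian noise of entry std sigma.
d, sigma = 50, 0.01
rng = np.random.default_rng(1)
x_star = rng.standard_normal(d)
x_star /= np.linalg.norm(x_star)
Y_star = np.outer(x_star, x_star)
E = sigma * rng.standard_normal((d, d))
E = (E + E.T) / np.sqrt(2)  # symmetrize, keeping entry variance sigma^2
Y = Y_star + E

# Compare plain GD (nu = 0) with perturbed GD (nu > 0) by per-entry MSE.
for nu in (0.0, 1e-3):
    X_hat = perturbed_gd(Y, d, nu=nu)
    mse = np.linalg.norm(X_hat @ X_hat.T - Y_star, "fro") ** 2 / d ** 2
    print(f"nu={nu}: per-entry MSE = {mse:.2e}")
```

The sketch keeps the noise spectral norm well below the signal eigenvalue (sigma is small relative to $1/\sqrt{d}$), consistent with the "mild conditions" flavor of the result; it is meant only to make the two estimators concrete, not to reproduce the paper's rates.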

Cite this Paper


BibTeX
@InProceedings{pmlr-v151-liu22c,
  title = {Noise Regularizes Over-parameterized Rank One Matrix Recovery, Provably},
  author = {Liu, Tianyi and Li, Yan and Zhou, Enlu and Zhao, Tuo},
  booktitle = {Proceedings of The 25th International Conference on Artificial Intelligence and Statistics},
  pages = {2784--2802},
  year = {2022},
  editor = {Camps-Valls, Gustau and Ruiz, Francisco J. R. and Valera, Isabel},
  volume = {151},
  series = {Proceedings of Machine Learning Research},
  month = {28--30 Mar},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v151/liu22c/liu22c.pdf},
  url = {https://proceedings.mlr.press/v151/liu22c.html},
  abstract = {We investigate the role of noise in optimization algorithms for learning over-parameterized models. Specifically, we consider the recovery of a rank one matrix $Y^*\in \mathbb{R}^{d\times d}$ from a noisy observation $Y$ using an over-parameterized model: we parameterize the rank one matrix $Y^*$ by $XX^\top$, where $X\in \mathbb{R}^{d\times d}$. We then show that, under mild conditions, the estimator obtained by randomly perturbed gradient descent on the square loss attains a mean square error of $O(\sigma^2/d)$, where $\sigma^2$ is the variance of the observational noise. In contrast, the estimator obtained by gradient descent without random perturbation only attains a mean square error of $O(\sigma^2)$. Our result partially justifies the implicit regularization effect of noise when learning over-parameterized models, and provides new understanding of the training of over-parameterized neural networks.}
}
Endnote
%0 Conference Paper
%T Noise Regularizes Over-parameterized Rank One Matrix Recovery, Provably
%A Tianyi Liu
%A Yan Li
%A Enlu Zhou
%A Tuo Zhao
%B Proceedings of The 25th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2022
%E Gustau Camps-Valls
%E Francisco J. R. Ruiz
%E Isabel Valera
%F pmlr-v151-liu22c
%I PMLR
%P 2784--2802
%U https://proceedings.mlr.press/v151/liu22c.html
%V 151
%X We investigate the role of noise in optimization algorithms for learning over-parameterized models. Specifically, we consider the recovery of a rank one matrix $Y^*\in \mathbb{R}^{d\times d}$ from a noisy observation $Y$ using an over-parameterized model: we parameterize the rank one matrix $Y^*$ by $XX^\top$, where $X\in \mathbb{R}^{d\times d}$. We then show that, under mild conditions, the estimator obtained by randomly perturbed gradient descent on the square loss attains a mean square error of $O(\sigma^2/d)$, where $\sigma^2$ is the variance of the observational noise. In contrast, the estimator obtained by gradient descent without random perturbation only attains a mean square error of $O(\sigma^2)$. Our result partially justifies the implicit regularization effect of noise when learning over-parameterized models, and provides new understanding of the training of over-parameterized neural networks.
APA
Liu, T., Li, Y., Zhou, E. & Zhao, T. (2022). Noise Regularizes Over-parameterized Rank One Matrix Recovery, Provably. Proceedings of The 25th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 151:2784-2802. Available from https://proceedings.mlr.press/v151/liu22c.html.