Understanding Incremental Learning of Gradient Descent: A Fine-grained Analysis of Matrix Sensing

Jikai Jin; Zhiyuan Li; Kaifeng Lyu; Simon Shaolei Du; Jason D. Lee

Proceedings of the 40th International Conference on Machine Learning, PMLR 202:15200-15238, 2023.

Abstract

It is believed that Gradient Descent (GD) induces an implicit bias towards good generalization in training machine learning models. This paper provides a fine-grained analysis of the dynamics of GD for the matrix sensing problem, whose goal is to recover a low-rank ground-truth matrix from near-isotropic linear measurements. It is shown that GD with small initialization behaves similarly to the greedy low-rank learning heuristics and follows an incremental learning procedure: GD sequentially learns solutions with increasing ranks until it recovers the ground truth matrix. Compared to existing works which only analyze the first learning phase for rank-1 solutions, our result provides characterizations for the whole learning process. Moreover, besides the over-parameterized regime that many prior works focused on, our analysis of the incremental learning procedure also applies to the under-parameterized regime. Finally, we conduct numerical experiments to confirm our theoretical findings.
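The incremental behavior described in the abstract can be illustrated with a small simulation. The following is a minimal sketch, not the authors' code: it assumes a common factorized formulation, f(U) = 1/(4m) Σ_i (⟨A_i, UUᵀ⟩ − y_i)², with symmetric Gaussian measurement matrices and noiseless measurements. The function name `run_matrix_sensing_gd`, the dimension, eigenvalues, step size, initialization scale, and the rank threshold `rank_tol` are all hypothetical choices made for illustration and do not come from the paper.

```python
import numpy as np


def run_matrix_sensing_gd(d=8, eigvals=(1.0, 0.5, 0.25), m=4000,
                          alpha=1e-3, lr=0.1, steps=1500,
                          rank_tol=0.1, seed=0):
    """GD on f(U) = 1/(4m) * sum_i (<A_i, U U^T> - y_i)^2 from a small init.

    All defaults are illustrative assumptions, not values from the paper.
    Returns the final relative error and the numerical rank of U U^T
    recorded before each gradient step.
    """
    rng = np.random.default_rng(seed)

    # Ground truth Z* = Q diag(eigvals) Q^T: PSD with rank len(eigvals).
    Q, _ = np.linalg.qr(rng.standard_normal((d, len(eigvals))))
    Z_star = (Q * np.asarray(eigvals)) @ Q.T

    # Symmetric Gaussian sensing matrices, flattened into the rows of A.
    G = rng.standard_normal((m, d, d))
    A = ((G + G.transpose(0, 2, 1)) / 2).reshape(m, d * d)
    y = A @ Z_star.ravel()  # noiseless measurements y_i = <A_i, Z*>

    # Small random initialization; U is d x d (over-parameterized).
    U = alpha * rng.standard_normal((d, d))

    ranks = []
    for _ in range(steps):
        Z = U @ U.T
        # Numerical rank: number of eigenvalues of U U^T above rank_tol.
        ranks.append(int(np.sum(np.linalg.eigvalsh(Z) > rank_tol)))
        residual = A @ Z.ravel() - y             # <A_i, U U^T> - y_i
        M = (A.T @ residual).reshape(d, d) / m   # (1/m) sum_i r_i A_i
        U = U - lr * (M @ U)                     # gradient step on f

    rel_err = np.linalg.norm(U @ U.T - Z_star) / np.linalg.norm(Z_star)
    return rel_err, ranks


if __name__ == "__main__":
    rel_err, ranks = run_matrix_sensing_gd()
    # Step at which the numerical rank first reaches each value.
    first_hit = {r: ranks.index(r) for r in sorted(set(ranks))}
    print(rel_err, first_hit)
```

Under these assumptions, the recorded rank trajectory should step up one rank at a time, with larger eigenvalues of the ground truth picked up earlier. A smaller `alpha` should widen the gaps between consecutive phases, at the cost of more iterations.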

Cite this Paper

BibTeX

@InProceedings{pmlr-v202-jin23a,
  title = {Understanding Incremental Learning of Gradient Descent: A Fine-grained Analysis of Matrix Sensing},
  author = {Jin, Jikai and Li, Zhiyuan and Lyu, Kaifeng and Du, Simon Shaolei and Lee, Jason D.},
  booktitle = {Proceedings of the 40th International Conference on Machine Learning},
  pages = {15200--15238},
  year = {2023},
  editor = {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan},
  volume = {202},
  series = {Proceedings of Machine Learning Research},
  month = {23--29 Jul},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v202/jin23a/jin23a.pdf},
  url = {https://proceedings.mlr.press/v202/jin23a.html},
  abstract = {It is believed that Gradient Descent (GD) induces an implicit bias towards good generalization in training machine learning models. This paper provides a fine-grained analysis of the dynamics of GD for the matrix sensing problem, whose goal is to recover a low-rank ground-truth matrix from near-isotropic linear measurements. It is shown that GD with small initialization behaves similarly to the greedy low-rank learning heuristics and follows an incremental learning procedure: GD sequentially learns solutions with increasing ranks until it recovers the ground truth matrix. Compared to existing works which only analyze the first learning phase for rank-1 solutions, our result provides characterizations for the whole learning process. Moreover, besides the over-parameterized regime that many prior works focused on, our analysis of the incremental learning procedure also applies to the under-parameterized regime. Finally, we conduct numerical experiments to confirm our theoretical findings.}
}

Endnote

%0 Conference Paper
%T Understanding Incremental Learning of Gradient Descent: A Fine-grained Analysis of Matrix Sensing
%A Jikai Jin
%A Zhiyuan Li
%A Kaifeng Lyu
%A Simon Shaolei Du
%A Jason D. Lee
%B Proceedings of the 40th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Andreas Krause
%E Emma Brunskill
%E Kyunghyun Cho
%E Barbara Engelhardt
%E Sivan Sabato
%E Jonathan Scarlett
%F pmlr-v202-jin23a
%I PMLR
%P 15200--15238
%U https://proceedings.mlr.press/v202/jin23a.html
%V 202
%X It is believed that Gradient Descent (GD) induces an implicit bias towards good generalization in training machine learning models. This paper provides a fine-grained analysis of the dynamics of GD for the matrix sensing problem, whose goal is to recover a low-rank ground-truth matrix from near-isotropic linear measurements. It is shown that GD with small initialization behaves similarly to the greedy low-rank learning heuristics and follows an incremental learning procedure: GD sequentially learns solutions with increasing ranks until it recovers the ground truth matrix. Compared to existing works which only analyze the first learning phase for rank-1 solutions, our result provides characterizations for the whole learning process. Moreover, besides the over-parameterized regime that many prior works focused on, our analysis of the incremental learning procedure also applies to the under-parameterized regime. Finally, we conduct numerical experiments to confirm our theoretical findings.

APA

Jin, J., Li, Z., Lyu, K., Du, S.S. & Lee, J.D. (2023). Understanding Incremental Learning of Gradient Descent: A Fine-grained Analysis of Matrix Sensing. Proceedings of the 40th International Conference on Machine Learning, in Proceedings of Machine Learning Research 202:15200-15238. Available from https://proceedings.mlr.press/v202/jin23a.html.
