Understanding the Dynamics of Gradient Flow in Overparameterized Linear models

Salma Tarmoun; Guilherme Franca; Benjamin D Haeffele; Rene Vidal

Proceedings of the 38th International Conference on Machine Learning, PMLR 139:10153-10161, 2021.

Abstract

We provide a detailed analysis of the dynamics of the gradient flow in overparameterized two-layer linear models. A particularly interesting feature of this model is that its nonlinear dynamics can be exactly solved as a consequence of a large number of conservation laws that constrain the system to follow particular trajectories. More precisely, the gradient flow preserves the difference of the Gramian matrices of the input and output weights, and its convergence to equilibrium depends on both the magnitude of that difference (which is fixed at initialization) and the spectrum of the data. In addition, and generalizing prior work, we prove our results without assuming small, balanced or spectral initialization for the weights. Moreover, we establish interesting mathematical connections between matrix factorization problems and differential equations of the Riccati type.
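
The conservation law mentioned in the abstract can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes the squared loss L(W1, W2) = 0.5 * ||Y - W2 W1 X||_F^2, with W1 the input weights and W2 the output weights. The notation, the dimensions, and all function names are illustrative assumptions and are not taken from the paper. Under gradient flow, the time derivative of W1 W1^T - W2^T W2 vanishes identically; explicit Euler steps only approximate the flow, so a small drift of second order in the step size is expected.

```python
import numpy as np


def loss(W1, W2, X, Y):
    """Squared loss 0.5 * ||Y - W2 W1 X||_F^2 (assumed model, see lead-in)."""
    return 0.5 * np.linalg.norm(Y - W2 @ W1 @ X) ** 2


def gradient_flow_velocities(W1, W2, X, Y):
    """Right-hand side of the gradient flow: (dW1/dt, dW2/dt) = -grad L."""
    M = (Y - W2 @ W1 @ X) @ X.T   # residual correlated with the inputs
    dW1 = W2.T @ M                # equals -dL/dW1
    dW2 = M @ W1.T                # equals -dL/dW2
    return dW1, dW2


def gramian_difference(W1, W2):
    """Difference of the Gramians of the input and output weights."""
    return W1 @ W1.T - W2.T @ W2


def gramian_difference_rate(W1, W2, X, Y):
    """Time derivative of the Gramian difference along the flow."""
    dW1, dW2 = gradient_flow_velocities(W1, W2, X, Y)
    return dW1 @ W1.T + W1 @ dW1.T - dW2.T @ W2 - W2.T @ dW2


def simulate(step=1e-3, n_steps=5000, seed=0):
    """Approximate the flow with explicit Euler steps on random data."""
    rng = np.random.default_rng(seed)
    d_in, hidden, d_out, n = 3, 5, 2, 20   # hidden > d_in, d_out
    X = rng.standard_normal((d_in, n)) / np.sqrt(n)
    Y = rng.standard_normal((d_out, d_in)) @ X
    # Generic initialization: not small, balanced, or spectral.
    W1 = 0.5 * rng.standard_normal((hidden, d_in))
    W2 = 0.5 * rng.standard_normal((d_out, hidden))

    D0 = gramian_difference(W1, W2)
    loss0 = loss(W1, W2, X, Y)
    for _ in range(n_steps):
        dW1, dW2 = gradient_flow_velocities(W1, W2, X, Y)
        W1, W2 = W1 + step * dW1, W2 + step * dW2

    drift = np.linalg.norm(gramian_difference(W1, W2) - D0)
    return drift, loss0, loss(W1, W2, X, Y)


if __name__ == "__main__":
    drift, loss0, loss1 = simulate()
    print(f"loss: {loss0:.4f} -> {loss1:.6f}, Gramian drift: {drift:.2e}")
```

Shrinking `step` should shrink the reported drift, which is consistent with the quantity being exactly conserved in the continuous-time limit.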

Cite this Paper

BibTeX

@InProceedings{pmlr-v139-tarmoun21a,
  title = 	 {Understanding the Dynamics of Gradient Flow in Overparameterized Linear models},
  author =       {Tarmoun, Salma and Franca, Guilherme and Haeffele, Benjamin D and Vidal, Rene},
  booktitle = 	 {Proceedings of the 38th International Conference on Machine Learning},
  pages = 	 {10153--10161},
  year = 	 {2021},
  editor = 	 {Meila, Marina and Zhang, Tong},
  volume = 	 {139},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {18--24 Jul},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v139/tarmoun21a/tarmoun21a.pdf},
  url = 	 {https://proceedings.mlr.press/v139/tarmoun21a.html},
  abstract = 	 {We provide a detailed analysis of the dynamics of the gradient flow in overparameterized two-layer linear models. A particularly interesting feature of this model is that its nonlinear dynamics can be exactly solved as a consequence of a large number of conservation laws that constrain the system to follow particular trajectories. More precisely, the gradient flow preserves the difference of the Gramian matrices of the input and output weights, and its convergence to equilibrium depends on both the magnitude of that difference (which is fixed at initialization) and the spectrum of the data. In addition, and generalizing prior work, we prove our results without assuming small, balanced or spectral initialization for the weights. Moreover, we establish interesting mathematical connections between matrix factorization problems and differential equations of the Riccati type.}
}

Endnote

%0 Conference Paper
%T Understanding the Dynamics of Gradient Flow in Overparameterized Linear models
%A Salma Tarmoun
%A Guilherme Franca
%A Benjamin D Haeffele
%A Rene Vidal
%B Proceedings of the 38th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2021
%E Marina Meila
%E Tong Zhang
%F pmlr-v139-tarmoun21a
%I PMLR
%P 10153--10161
%U https://proceedings.mlr.press/v139/tarmoun21a.html
%V 139
%X We provide a detailed analysis of the dynamics of the gradient flow in overparameterized two-layer linear models. A particularly interesting feature of this model is that its nonlinear dynamics can be exactly solved as a consequence of a large number of conservation laws that constrain the system to follow particular trajectories. More precisely, the gradient flow preserves the difference of the Gramian matrices of the input and output weights, and its convergence to equilibrium depends on both the magnitude of that difference (which is fixed at initialization) and the spectrum of the data. In addition, and generalizing prior work, we prove our results without assuming small, balanced or spectral initialization for the weights. Moreover, we establish interesting mathematical connections between matrix factorization problems and differential equations of the Riccati type.

APA

Tarmoun, S., Franca, G., Haeffele, B.D. & Vidal, R. (2021). Understanding the Dynamics of Gradient Flow in Overparameterized Linear models. Proceedings of the 38th International Conference on Machine Learning, in Proceedings of Machine Learning Research 139:10153-10161. Available from https://proceedings.mlr.press/v139/tarmoun21a.html.
