Curvature-corrected learning dynamics in deep neural networks

Dongsung Huh

Proceedings of the 37th International Conference on Machine Learning, PMLR 119:4552-4560, 2020.

Abstract

Deep neural networks exhibit complex learning dynamics due to their non-convex loss landscapes. Second-order optimization methods facilitate learning dynamics by compensating for ill-conditioned curvature. In this work, we investigate how curvature correction modifies the learning dynamics in deep linear neural networks and provide analytical solutions. We derive a generalized conservation law that preserves the path of parameter dynamics from curvature correction, which shows that curvature correction only modifies the temporal profiles of dynamics along the path. We show that while curvature correction accelerates the convergence dynamics of the input-output map, it can also negatively affect the generalization performance. Our analysis also reveals an undesirable effect of curvature correction that compromises the stability of parameter dynamics during learning, especially with the block-diagonal approximation of natural gradient descent. We introduce fractional curvature correction that resolves this problem while retaining most of the acceleration benefits of full curvature correction.

Cite this Paper

BibTeX

@InProceedings{pmlr-v119-huh20a,
  title = 	 {Curvature-corrected learning dynamics in deep neural networks},
  author =       {Huh, Dongsung},
  booktitle = 	 {Proceedings of the 37th International Conference on Machine Learning},
  pages = 	 {4552--4560},
  year = 	 {2020},
  editor = 	 {Daumé III, Hal and Singh, Aarti},
  volume = 	 {119},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {13--18 Jul},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v119/huh20a/huh20a.pdf},
  url = 	 {https://proceedings.mlr.press/v119/huh20a.html},
  abstract = 	 {Deep neural networks exhibit complex learning dynamics due to their non-convex loss landscapes. Second-order optimization methods facilitate learning dynamics by compensating for ill-conditioned curvature. In this work, we investigate how curvature correction modifies the learning dynamics in deep linear neural networks and provide analytical solutions. We derive a generalized conservation law that preserves the path of parameter dynamics from curvature correction, which shows that curvature correction only modifies the temporal profiles of dynamics along the path. We show that while curvature correction accelerates the convergence dynamics of the input-output map, it can also negatively affect the generalization performance. Our analysis also reveals an undesirable effect of curvature correction that compromises stability of parameters dynamics during learning, especially with block-diagonal approximation of natural gradient descent. We introduce fractional curvature correction that resolves this problem while retaining most of the acceleration benefits of full curvature correction.}
}

Endnote

%0 Conference Paper
%T Curvature-corrected learning dynamics in deep neural networks
%A Dongsung Huh
%B Proceedings of the 37th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2020
%E Hal Daumé III
%E Aarti Singh
%F pmlr-v119-huh20a
%I PMLR
%P 4552--4560
%U https://proceedings.mlr.press/v119/huh20a.html
%V 119
%X Deep neural networks exhibit complex learning dynamics due to their non-convex loss landscapes. Second-order optimization methods facilitate learning dynamics by compensating for ill-conditioned curvature. In this work, we investigate how curvature correction modifies the learning dynamics in deep linear neural networks and provide analytical solutions. We derive a generalized conservation law that preserves the path of parameter dynamics from curvature correction, which shows that curvature correction only modifies the temporal profiles of dynamics along the path. We show that while curvature correction accelerates the convergence dynamics of the input-output map, it can also negatively affect the generalization performance. Our analysis also reveals an undesirable effect of curvature correction that compromises stability of parameters dynamics during learning, especially with block-diagonal approximation of natural gradient descent. We introduce fractional curvature correction that resolves this problem while retaining most of the acceleration benefits of full curvature correction.

APA

Huh, D. (2020). Curvature-corrected learning dynamics in deep neural networks. Proceedings of the 37th International Conference on Machine Learning, in Proceedings of Machine Learning Research 119:4552-4560. Available from https://proceedings.mlr.press/v119/huh20a.html.
