A Dynamical Systems Perspective on Nesterov Acceleration

Michael Muehlebach, Michael Jordan
Proceedings of the 36th International Conference on Machine Learning, PMLR 97:4656-4662, 2019.

Abstract

We present a dynamical system framework for understanding Nesterov’s accelerated gradient method. In contrast to earlier work, our derivation does not rely on a vanishing step size argument. We show that Nesterov acceleration arises from discretizing an ordinary differential equation with a semi-implicit Euler integration scheme. We analyze both the underlying differential equation as well as the discretization to obtain insights into the phenomenon of acceleration. The analysis suggests that a curvature-dependent damping term lies at the heart of the phenomenon. We further establish connections between the discretized and the continuous-time dynamics.

Cite this Paper


BibTeX
@InProceedings{pmlr-v97-muehlebach19a, title = {A Dynamical Systems Perspective on {N}esterov Acceleration}, author = {Muehlebach, Michael and Jordan, Michael}, booktitle = {Proceedings of the 36th International Conference on Machine Learning}, pages = {4656--4662}, year = {2019}, editor = {Chaudhuri, Kamalika and Salakhutdinov, Ruslan}, volume = {97}, series = {Proceedings of Machine Learning Research}, month = {09--15 Jun}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v97/muehlebach19a/muehlebach19a.pdf}, url = {https://proceedings.mlr.press/v97/muehlebach19a.html}, abstract = {We present a dynamical system framework for understanding Nesterov’s accelerated gradient method. In contrast to earlier work, our derivation does not rely on a vanishing step size argument. We show that Nesterov acceleration arises from discretizing an ordinary differential equation with a semi-implicit Euler integration scheme. We analyze both the underlying differential equation as well as the discretization to obtain insights into the phenomenon of acceleration. The analysis suggests that a curvature-dependent damping term lies at the heart of the phenomenon. We further establish connections between the discretized and the continuous-time dynamics.} }
Endnote
%0 Conference Paper %T A Dynamical Systems Perspective on Nesterov Acceleration %A Michael Muehlebach %A Michael Jordan %B Proceedings of the 36th International Conference on Machine Learning %C Proceedings of Machine Learning Research %D 2019 %E Kamalika Chaudhuri %E Ruslan Salakhutdinov %F pmlr-v97-muehlebach19a %I PMLR %P 4656--4662 %U https://proceedings.mlr.press/v97/muehlebach19a.html %V 97 %X We present a dynamical system framework for understanding Nesterov’s accelerated gradient method. In contrast to earlier work, our derivation does not rely on a vanishing step size argument. We show that Nesterov acceleration arises from discretizing an ordinary differential equation with a semi-implicit Euler integration scheme. We analyze both the underlying differential equation as well as the discretization to obtain insights into the phenomenon of acceleration. The analysis suggests that a curvature-dependent damping term lies at the heart of the phenomenon. We further establish connections between the discretized and the continuous-time dynamics.
APA
Muehlebach, M. & Jordan, M.. (2019). A Dynamical Systems Perspective on Nesterov Acceleration. Proceedings of the 36th International Conference on Machine Learning, in Proceedings of Machine Learning Research 97:4656-4662 Available from https://proceedings.mlr.press/v97/muehlebach19a.html.

Related Material