A Lyapunov analysis for accelerated gradient methods: from deterministic to stochastic case
Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics, PMLR 108:602-612, 2020.
Recent work by Su, Boyd and Candes made a connection between Nesterov’s accelerated gradient descent method and an ordinary differential equation (ODE). We show that this connection can be extended to the case of stochastic gradients, and develop Lyapunov function based convergence rates proof for Nesterov’s accelerated stochastic gradient descent. In the gradient case, we show Nesterov’s method arises as a straightforward discretization of a modified ODE. Established Lyapunov analysis is used to recover the accelerated rates of convergence in both continuous and discrete time. Moreover, the Lyapunov analysis can be extended to the case of stochastic gradients. The result is a unified approach to accelerationin both continuous and discrete time, and in for both stochastic and full gradients.