From Averaging to Acceleration, There is Only a Step-size

Nicolas Flammarion; Francis Bach

From Averaging to Acceleration, There is Only a Step-size

Nicolas Flammarion, Francis Bach

Proceedings of The 28th Conference on Learning Theory, PMLR 40:658-695, 2015.

Abstract

We show that accelerated gradient descent, averaged gradient descent and the heavy-ball method for quadratic non-strongly-convex problems may be reformulated as constant parameter second-order difference equation algorithms, where stability of the system is equivalent to convergence at rate O(1/n^2), where n is the number of iterations. We provide a detailed analysis of the eigenvalues of the corresponding linear dynamical system, showing various oscillatory and non-oscillatory behaviors, together with a sharp stability result with explicit constants. We also consider the situation where noisy gradients are available, where we extend our general convergence result, which suggests an alternative algorithm (i.e., with different step sizes) that exhibits the good aspects of both averaging and acceleration.

Cite this Paper

BibTeX


@InProceedings{pmlr-v40-Flammarion15,
  title = 	 {From Averaging to Acceleration, There is Only a Step-size},
  author = 	 {Flammarion, Nicolas and Bach, Francis},
  booktitle = 	 {Proceedings of The 28th Conference on Learning Theory},
  pages = 	 {658--695},
  year = 	 {2015},
  editor = 	 {Grünwald, Peter and Hazan, Elad and Kale, Satyen},
  volume = 	 {40},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Paris, France},
  month = 	 {03--06 Jul},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v40/Flammarion15.pdf},
  url = 	 {https://proceedings.mlr.press/v40/Flammarion15.html},
  abstract = 	 {We show that accelerated gradient descent, averaged gradient descent and the heavy-ball method for quadratic non-strongly-convex problems may be reformulated as   constant parameter second-order difference equation algorithms, where stability of the system is equivalent to convergence at rate O(1/n^2), where n is the number of iterations. We provide a detailed analysis of the eigenvalues of the corresponding linear dynamical system, showing various oscillatory and non-oscillatory behaviors, together with a sharp stability result with explicit constants. We also consider the situation where noisy gradients are available, where we extend our general convergence result, which suggests an alternative algorithm (i.e., with different step sizes) that exhibits the good aspects of both averaging and acceleration.}
}

Endnote

%0 Conference Paper
%T From Averaging to Acceleration, There is Only a Step-size
%A Nicolas Flammarion
%A Francis Bach
%B Proceedings of The 28th Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2015
%E Peter Grünwald
%E Elad Hazan
%E Satyen Kale	
%F pmlr-v40-Flammarion15
%I PMLR
%P 658--695
%U https://proceedings.mlr.press/v40/Flammarion15.html
%V 40
%X We show that accelerated gradient descent, averaged gradient descent and the heavy-ball method for quadratic non-strongly-convex problems may be reformulated as   constant parameter second-order difference equation algorithms, where stability of the system is equivalent to convergence at rate O(1/n^2), where n is the number of iterations. We provide a detailed analysis of the eigenvalues of the corresponding linear dynamical system, showing various oscillatory and non-oscillatory behaviors, together with a sharp stability result with explicit constants. We also consider the situation where noisy gradients are available, where we extend our general convergence result, which suggests an alternative algorithm (i.e., with different step sizes) that exhibits the good aspects of both averaging and acceleration.

RIS


TY  - CPAPER
TI  - From Averaging to Acceleration, There is Only a Step-size
AU  - Nicolas Flammarion
AU  - Francis Bach
BT  - Proceedings of The 28th Conference on Learning Theory
DA  - 2015/06/26
ED  - Peter Grünwald
ED  - Elad Hazan
ED  - Satyen Kale	
ID  - pmlr-v40-Flammarion15
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 40
SP  - 658
EP  - 695
L1  - http://proceedings.mlr.press/v40/Flammarion15.pdf
UR  - https://proceedings.mlr.press/v40/Flammarion15.html
AB  - We show that accelerated gradient descent, averaged gradient descent and the heavy-ball method for quadratic non-strongly-convex problems may be reformulated as   constant parameter second-order difference equation algorithms, where stability of the system is equivalent to convergence at rate O(1/n^2), where n is the number of iterations. We provide a detailed analysis of the eigenvalues of the corresponding linear dynamical system, showing various oscillatory and non-oscillatory behaviors, together with a sharp stability result with explicit constants. We also consider the situation where noisy gradients are available, where we extend our general convergence result, which suggests an alternative algorithm (i.e., with different step sizes) that exhibits the good aspects of both averaging and acceleration.
ER  -

APA


Flammarion, N. & Bach, F.. (2015). From Averaging to Acceleration, There is Only a Step-size. Proceedings of The 28th Conference on Learning Theory, in Proceedings of Machine Learning Research 40:658-695 Available from https://proceedings.mlr.press/v40/Flammarion15.html.

Related Material

Download PDF