Does Momentum Help in Stochastic Optimization? A Sample Complexity Analysis.
Proceedings of the Thirty-Ninth Conference on Uncertainty in Artificial Intelligence, PMLR 216:602-612, 2023.
Abstract
Stochastic Heavy Ball (SHB) and Nesterov’s Accelerated Stochastic Gradient (ASG) are popular momentum methods in optimization. While the benefits of these acceleration ideas in deterministic settings are well understood, their advantages in stochastic optimization remain unclear. Several works have recently claimed that SHB and ASG always help in stochastic optimization. Our work shows that (i) these claims are either flawed or one-sided (e.g., they consider only the bias term but not the variance term), and (ii) when both terms are accounted for, SHB and ASG do not always help. Specifically, for any quadratic optimization problem, we obtain a lower bound on the sample complexity of SHB and ASG that accounts for both bias and variance, and show that vanilla SGD can achieve the same bound.
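For reference, a sketch of the standard update rules for the three methods, assuming the common textbook parameterization with step size \alpha, momentum parameter \beta, and stochastic gradient g_k at iterate x_k (the paper's exact constants and parameterization may differ):

% SGD: plain stochastic gradient step
x_{k+1} = x_k - \alpha\, g_k

% SHB: gradient step plus a heavy-ball momentum term
x_{k+1} = x_k - \alpha\, g_k + \beta\,(x_k - x_{k-1})

% ASG (Nesterov): extrapolate first, then take a stochastic gradient step at the extrapolated point y_k
y_k = x_k + \beta\,(x_k - x_{k-1}), \qquad x_{k+1} = y_k - \alpha\, \tilde{g}_k,

where \tilde{g}_k denotes a stochastic gradient evaluated at y_k rather than at x_k.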