Open Problem: Anytime Convergence Rate of Gradient Descent

Guy Kornowski, Ohad Shamir
Proceedings of Thirty Seventh Conference on Learning Theory, PMLR 247:5335-5339, 2024.

Abstract

Recent results show that vanilla gradient descent can be accelerated for smooth convex objectives, merely by changing the stepsize sequence. We show that this can lead to surprisingly large errors indefinitely, and therefore ask: Is there any stepsize schedule for gradient descent that accelerates the classic $\mathcal{O}(1/T)$ convergence rate, at \emph{any} stopping time $T$?
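The question concerns gradient descent $x_{t+1} = x_t - \eta_t \nabla f(x_t)$ run with a chosen stepsize schedule $(\eta_t)$, and whether any schedule can improve on the classic $\mathcal{O}(1/T)$ guarantee simultaneously at every stopping time $T$. As an illustration only (not from the paper), the following minimal Python sketch runs gradient descent on a smooth convex quadratic with a user-supplied stepsize schedule and, at each stopping time $T$, compares the suboptimality gap with the textbook $L\|x_0 - x^*\|^2/(2T)$ bound attained by the constant stepsize $1/L$; the test objective, the schedule, and all names are illustrative assumptions.

# Illustrative sketch (not from the paper): gradient descent with an arbitrary
# stepsize schedule on a smooth convex quadratic, reporting the suboptimality
# gap at every stopping time T next to the classic L*||x0 - x*||^2 / (2T)
# baseline for constant stepsize 1/L. All objects below are assumptions made
# for illustration.

import numpy as np

def gradient_descent(grad, x0, stepsizes):
    """Run GD with the given stepsize sequence; return the list of iterates."""
    xs = [np.array(x0, dtype=float)]
    for eta in stepsizes:
        xs.append(xs[-1] - eta * grad(xs[-1]))
    return xs

# Smooth convex test objective f(x) = 0.5 * x^T A x, minimized at x* = 0,
# with smoothness constant L = largest eigenvalue of A.
A = np.diag([1.0, 10.0])
L = 10.0
f = lambda x: 0.5 * x @ A @ x
grad = lambda x: A @ x

x0 = np.array([1.0, 1.0])
T = 50
schedule = [1.0 / L] * T  # classic constant schedule giving the O(1/T) rate

xs = gradient_descent(grad, x0, schedule)
for t in range(1, T + 1):
    gap = f(xs[t])                              # f(x_t) - f*, since f* = 0
    baseline = L * np.dot(x0, x0) / (2 * t)     # classic bound at stopping time t
    print(f"T={t:3d}  f(x_T)-f* = {gap:.3e}   classic bound = {baseline:.3e}")

Swapping in a non-constant schedule for `schedule` makes it easy to probe, empirically, whether a given stepsize sequence stays below the classic baseline at every $T$ rather than only at a preselected horizon, which is exactly the "anytime" property the open problem asks about.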

Cite this Paper


BibTeX
@InProceedings{pmlr-v247-kornowski24a,
  title     = {Open Problem: Anytime Convergence Rate of Gradient Descent},
  author    = {Kornowski, Guy and Shamir, Ohad},
  booktitle = {Proceedings of Thirty Seventh Conference on Learning Theory},
  pages     = {5335--5339},
  year      = {2024},
  editor    = {Agrawal, Shipra and Roth, Aaron},
  volume    = {247},
  series    = {Proceedings of Machine Learning Research},
  month     = {30 Jun--03 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v247/kornowski24a/kornowski24a.pdf},
  url       = {https://proceedings.mlr.press/v247/kornowski24a.html},
  abstract  = {Recent results show that vanilla gradient descent can be accelerated for smooth convex objectives, merely by changing the stepsize sequence. We show that this can lead to surprisingly large errors indefinitely, and therefore ask: Is there any stepsize schedule for gradient descent that accelerates the classic $\mathcal{O}(1/T)$ convergence rate, at \emph{any} stopping time $T$?}
}
APA
Kornowski, G., & Shamir, O. (2024). Open Problem: Anytime Convergence Rate of Gradient Descent. Proceedings of Thirty Seventh Conference on Learning Theory, in Proceedings of Machine Learning Research 247:5335-5339. Available from https://proceedings.mlr.press/v247/kornowski24a.html.
