Anytime Acceleration of Gradient Descent

Zihan Zhang, Jason Lee, Simon Du, Yuxin Chen
Proceedings of Thirty Eighth Conference on Learning Theory, PMLR 291:5991-6013, 2025.

Abstract

This work investigates stepsize-based acceleration of gradient descent with anytime convergence guarantees. For smooth (non-strongly) convex optimization, we propose a stepsize schedule that allows gradient descent to achieve convergence guarantees of $O\big(T^{-\frac{2\log_2\rho}{1+\log_2\rho}}\big) \approx O(T^{-1.119})$ for any stopping time $T$, where $\rho=\sqrt{2}+1$ is the silver ratio and the stepsize schedule is predetermined without prior knowledge of the stopping time. This result provides an affirmative answer to a COLT open problem regarding whether stepsize-based acceleration can yield anytime convergence rates of $o(T^{-1})$. We further extend our theory to yield anytime convergence guarantees of $\exp(-\Omega(T/\kappa^{0.893}))$ for smooth and strongly convex optimization, with $\kappa$ being the condition number.
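As a quick check of the arithmetic in the abstract, the short Python sketch below evaluates the rate exponent $\frac{2\log_2\rho}{1+\log_2\rho}$ at the silver ratio $\rho=\sqrt{2}+1$. The variable names are illustrative only. The comparison of the reciprocal with the exponent $0.893$ on $\kappa$ is a numerical observation made here, not a relationship stated in the abstract.

```python
import math

# Silver ratio as defined in the abstract: rho = sqrt(2) + 1.
rho = math.sqrt(2) + 1
log2_rho = math.log2(rho)

# Exponent of the anytime rate O(T^{-exponent}) for smooth convex problems.
exponent = 2 * log2_rho / (1 + log2_rho)

# Reciprocal of that exponent, printed only for comparison with the
# kappa^{0.893} term quoted for the strongly convex case (an observation,
# not a claim taken from the abstract).
reciprocal = 1 / exponent

print(f"rho        = {rho:.4f}")
print(f"exponent   = {exponent:.4f}")
print(f"1/exponent = {reciprocal:.4f}")
```

The computed exponent should agree with the quoted value of roughly $1.119$, which is strictly greater than $1$ and hence consistent with the claimed $o(T^{-1})$ rate.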

Cite this Paper


BibTeX
@InProceedings{pmlr-v291-zhang25a,
  title = {Anytime Acceleration of Gradient Descent},
  author = {Zhang, Zihan and Lee, Jason and Du, Simon and Chen, Yuxin},
  booktitle = {Proceedings of Thirty Eighth Conference on Learning Theory},
  pages = {5991--6013},
  year = {2025},
  editor = {Haghtalab, Nika and Moitra, Ankur},
  volume = {291},
  series = {Proceedings of Machine Learning Research},
  month = {30 Jun--04 Jul},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v291/main/assets/zhang25a/zhang25a.pdf},
  url = {https://proceedings.mlr.press/v291/zhang25a.html},
  abstract = {This work investigates stepsize-based acceleration of gradient descent with anytime convergence guarantees. For smooth (non-strongly) convex optimization, we propose a stepsize schedule that allows gradient descent to achieve convergence guarantees of $O\big(T^{-\frac{2\log_2\rho}{1+\log_2\rho}}\big) \approx O(T^{-1.119})$ for any stopping time $T$, where $\rho=\sqrt{2}+1$ is the silver ratio and the stepsize schedule is predetermined without prior knowledge of the stopping time. This result provides an affirmative answer to a COLT open problem regarding whether stepsize-based acceleration can yield anytime convergence rates of $o(T^{-1})$. We further extend our theory to yield anytime convergence guarantees of $\exp(-\Omega(T/\kappa^{0.893}))$ for smooth and strongly convex optimization, with $\kappa$ being the condition number.}
}
Endnote
%0 Conference Paper
%T Anytime Acceleration of Gradient Descent
%A Zihan Zhang
%A Jason Lee
%A Simon Du
%A Yuxin Chen
%B Proceedings of Thirty Eighth Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2025
%E Nika Haghtalab
%E Ankur Moitra
%F pmlr-v291-zhang25a
%I PMLR
%P 5991--6013
%U https://proceedings.mlr.press/v291/zhang25a.html
%V 291
%X This work investigates stepsize-based acceleration of gradient descent with anytime convergence guarantees. For smooth (non-strongly) convex optimization, we propose a stepsize schedule that allows gradient descent to achieve convergence guarantees of $O\big(T^{-\frac{2\log_2\rho}{1+\log_2\rho}}\big) \approx O(T^{-1.119})$ for any stopping time $T$, where $\rho=\sqrt{2}+1$ is the silver ratio and the stepsize schedule is predetermined without prior knowledge of the stopping time. This result provides an affirmative answer to a COLT open problem regarding whether stepsize-based acceleration can yield anytime convergence rates of $o(T^{-1})$. We further extend our theory to yield anytime convergence guarantees of $\exp(-\Omega(T/\kappa^{0.893}))$ for smooth and strongly convex optimization, with $\kappa$ being the condition number.
APA
Zhang, Z., Lee, J., Du, S. & Chen, Y. (2025). Anytime Acceleration of Gradient Descent. Proceedings of Thirty Eighth Conference on Learning Theory, in Proceedings of Machine Learning Research 291:5991-6013. Available from https://proceedings.mlr.press/v291/zhang25a.html.
