Gradient Methods with Online Scaling

Wenzhi Gao, Ya-Chi Chu, Yinyu Ye, Madeleine Udell
Proceedings of Thirty Eighth Conference on Learning Theory, PMLR 291:2192-2226, 2025.

Abstract

We introduce a framework to accelerate the convergence of gradient-based methods with online learning. The framework learns to scale the gradient at each iteration through an online learning algorithm and provably accelerates gradient-based methods asymptotically. In contrast with previous literature, where convergence is established based on worst-case analysis, our framework provides a strong convergence guarantee with respect to the optimal stepsize for the iteration trajectory. For smooth strongly convex optimization, our framework provides an $\Ocal(\kappa^\star \log(1/\varepsilon))$ asymptotic complexity result, where $\kappa^\star$ is the condition number achievable by the optimal preconditioner, improving on the previous $\Ocal(\sqrt{n}\kappa^\star \log(1/\varepsilon))$ result. For smooth convex optimization, we obtain the first convergence guarantee for the widely used hypergradient descent heuristic.
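To make the idea concrete, below is a minimal sketch of the kind of method the abstract describes: a gradient step whose per-coordinate scaling is adapted online, using the classical hypergradient-descent heuristic (grow a coordinate's stepsize when successive gradients align, shrink it when they oppose). The function name `scaled_gradient_descent` and the parameters `base_lr`, `hyper_lr`, and `iters` are illustrative choices, not notation from the paper, and the update shown is the generic heuristic rather than the exact online-learning scheme analyzed there.

```python
import numpy as np

def scaled_gradient_descent(grad, x0, base_lr=1e-2, hyper_lr=1e-4, iters=500):
    """Gradient descent with a per-coordinate scaling adapted online.

    Illustrative sketch: the diagonal scaling is updated with the
    hypergradient-descent heuristic, not the paper's exact algorithm.
    """
    x = x0.astype(float).copy()
    scale = np.full_like(x, base_lr)   # per-coordinate stepsizes (diagonal preconditioner)
    g_prev = np.zeros_like(x)
    for _ in range(iters):
        g = grad(x)
        # Online scaling update: increase where current and previous gradients
        # agree in sign, decrease where they disagree; clamp at zero.
        scale = np.maximum(scale + hyper_lr * g * g_prev, 0.0)
        x = x - scale * g              # scaled gradient step
        g_prev = g
    return x

# Example: ill-conditioned quadratic f(x) = 0.5 * x^T diag(d) x, minimized at the origin.
d = np.array([1.0, 100.0])
x_final = scaled_gradient_descent(lambda x: d * x, x0=np.array([5.0, 5.0]))
print(x_final)  # should be close to [0, 0]
```

On the quadratic above, the scaling for the flat coordinate grows over time while the stiff coordinate keeps a small stepsize, which is the trajectory-dependent adaptivity the framework formalizes.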

Cite this Paper


BibTeX
@InProceedings{pmlr-v291-gao25a, title = {Gradient Methods with Online Scaling}, author = {Gao, Wenzhi and Chu, Ya-Chi and Ye, Yinyu and Udell, Madeleine}, booktitle = {Proceedings of Thirty Eighth Conference on Learning Theory}, pages = {2192--2226}, year = {2025}, editor = {Haghtalab, Nika and Moitra, Ankur}, volume = {291}, series = {Proceedings of Machine Learning Research}, month = {30 Jun--04 Jul}, publisher = {PMLR}, pdf = {https://raw.githubusercontent.com/mlresearch/v291/main/assets/gao25a/gao25a.pdf}, url = {https://proceedings.mlr.press/v291/gao25a.html}, abstract = {We introduce a framework to accelerate the convergence of gradient-based methods with online learning. The framework learns to scale the gradient at each iteration through an online learning algorithm and provably accelerates gradient-based methods asymptotically. In contrast with previous literature, where convergence is established based on worst-case analysis, our framework provides a strong convergence guarantee with respect to the optimal stepsize for the iteration trajectory. For smooth strongly convex optimization, our framework provides an $\Ocal(\kappa^\star \log(1/\varepsilon))$ asymptotic complexity result, where $\kappa^\star$ is the condition number achievable by the optimal preconditioner, improving on the previous $\Ocal(\sqrt{n}\kappa^\star \log(1/\varepsilon))$ result. For smooth convex optimization, we obtain the first convergence guarantee for the widely used hypergradient descent heuristic.} }
Endnote
%0 Conference Paper %T Gradient Methods with Online Scaling %A Wenzhi Gao %A Ya-Chi Chu %A Yinyu Ye %A Madeleine Udell %B Proceedings of Thirty Eighth Conference on Learning Theory %C Proceedings of Machine Learning Research %D 2025 %E Nika Haghtalab %E Ankur Moitra %F pmlr-v291-gao25a %I PMLR %P 2192--2226 %U https://proceedings.mlr.press/v291/gao25a.html %V 291 %X We introduce a framework to accelerate the convergence of gradient-based methods with online learning. The framework learns to scale the gradient at each iteration through an online learning algorithm and provably accelerates gradient-based methods asymptotically. In contrast with previous literature, where convergence is established based on worst-case analysis, our framework provides a strong convergence guarantee with respect to the optimal stepsize for the iteration trajectory. For smooth strongly convex optimization, our framework provides an $\Ocal(\kappa^\star \log(1/\varepsilon))$ asymptotic complexity result, where $\kappa^\star$ is the condition number achievable by the optimal preconditioner, improving on the previous $\Ocal(\sqrt{n}\kappa^\star \log(1/\varepsilon))$ result. For smooth convex optimization, we obtain the first convergence guarantee for the widely used hypergradient descent heuristic.
APA
Gao, W., Chu, Y.-C., Ye, Y. & Udell, M. (2025). Gradient Methods with Online Scaling. Proceedings of Thirty Eighth Conference on Learning Theory, in Proceedings of Machine Learning Research 291:2192-2226. Available from https://proceedings.mlr.press/v291/gao25a.html.
