Locally Optimal Descent for Dynamic Stepsize Scheduling

Gilad Yehudai, Alon Cohen, Amit Daniely, Yoel Drori, Tomer Koren, Mariano Schain
Proceedings of The 28th International Conference on Artificial Intelligence and Statistics, PMLR 258:1099-1107, 2025.

Abstract

We introduce a novel dynamic learning-rate scheduling scheme grounded in theory, with the goal of simplifying the manual and time-consuming tuning of schedules in practice. Our approach is based on estimating the locally-optimal stepsize, guaranteeing maximal descent in the direction of the stochastic gradient of the current step. We first establish theoretical convergence bounds for our method in the context of smooth non-convex stochastic optimization. We then present a practical implementation of our algorithm and conduct systematic experiments across diverse datasets and optimization algorithms, comparing our scheme with existing state-of-the-art learning-rate schedulers. Our findings indicate that our method requires minimal tuning compared to existing approaches, removing the need for auxiliary manual schedules and warm-up phases while achieving comparable performance with drastically reduced parameter tuning.
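
The abstract does not spell out the estimator itself, but the general idea of a locally optimal stepsize can be illustrated with a one-dimensional quadratic model of the loss along the current (stochastic) gradient direction: probe the directional derivative at a nearby point, estimate the curvature along that direction, and take the step that minimizes the resulting quadratic. The sketch below is a minimal illustration of this idea in plain NumPy; the names and parameters (local_stepsize, grad_fn, probe, eta_max) are illustrative assumptions, not the paper's actual algorithm.

import numpy as np

def local_stepsize(grad_fn, x, g, probe=1e-3, eta_max=1.0):
    """Estimate the eta that minimizes f(x - eta * g) along direction g.

    Illustrative only: uses a finite-difference estimate of the directional
    curvature (a quadratic model of the 1-D slice), not the paper's estimator.
    grad_fn returns a (possibly stochastic) gradient at a given point.
    """
    if float(np.dot(g, g)) == 0.0:
        return 0.0
    # phi(eta) = f(x - eta * g); its derivative at eta = 0 is -g . grad f(x).
    d0 = -float(np.dot(g, grad_fn(x)))
    # Probe a small step along -g to estimate the curvature of phi at 0.
    d1 = -float(np.dot(g, grad_fn(x - probe * g)))
    curvature = (d1 - d0) / probe
    if curvature <= 0.0:  # locally non-convex slice: fall back to a capped step
        return eta_max
    return float(np.clip(-d0 / curvature, 0.0, eta_max))

# Toy usage on f(x) = 0.5 * x^T A x (gradient A x); in the stochastic setting
# g would be a minibatch gradient rather than the exact one.
A = np.diag([1.0, 10.0])
grad = lambda x: A @ x
x = np.array([1.0, 1.0])
for _ in range(20):
    g = grad(x)
    x = x - local_stepsize(grad, x, g) * g
print(x)  # approaches the minimizer at the origin

On a deterministic quadratic this recovers the exact steepest-descent line search; the paper's contribution concerns estimating such a locally optimal step in the stochastic, non-convex setting and analyzing its convergence.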

Cite this Paper


BibTeX
@InProceedings{pmlr-v258-yehudai25a,
  title     = {Locally Optimal Descent for Dynamic Stepsize Scheduling},
  author    = {Yehudai, Gilad and Cohen, Alon and Daniely, Amit and Drori, Yoel and Koren, Tomer and Schain, Mariano},
  booktitle = {Proceedings of The 28th International Conference on Artificial Intelligence and Statistics},
  pages     = {1099--1107},
  year      = {2025},
  editor    = {Li, Yingzhen and Mandt, Stephan and Agrawal, Shipra and Khan, Emtiyaz},
  volume    = {258},
  series    = {Proceedings of Machine Learning Research},
  month     = {03--05 May},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v258/main/assets/yehudai25a/yehudai25a.pdf},
  url       = {https://proceedings.mlr.press/v258/yehudai25a.html}
}
Endnote
%0 Conference Paper
%T Locally Optimal Descent for Dynamic Stepsize Scheduling
%A Gilad Yehudai
%A Alon Cohen
%A Amit Daniely
%A Yoel Drori
%A Tomer Koren
%A Mariano Schain
%B Proceedings of The 28th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2025
%E Yingzhen Li
%E Stephan Mandt
%E Shipra Agrawal
%E Emtiyaz Khan
%F pmlr-v258-yehudai25a
%I PMLR
%P 1099--1107
%U https://proceedings.mlr.press/v258/yehudai25a.html
%V 258
APA
Yehudai, G., Cohen, A., Daniely, A., Drori, Y., Koren, T. & Schain, M. (2025). Locally Optimal Descent for Dynamic Stepsize Scheduling. Proceedings of The 28th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 258:1099-1107. Available from https://proceedings.mlr.press/v258/yehudai25a.html.