Automatic Differentiation of Optimization Algorithms with Time-Varying Updates

Sheheryar Mehmood, Peter Ochs
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:43581-43602, 2025.

Abstract

Numerous optimization algorithms have a time-varying update rule thanks to, for instance, a changing step size, momentum parameter, or Hessian approximation. Often, such algorithms are used as solvers for the lower-level problem in bilevel optimization, and are unrolled when computing the gradient of the upper-level objective. In this paper, we apply unrolled or automatic differentiation to a time-varying iterative process and provide convergence (rate) guarantees for the resulting derivative iterates. We then adapt these convergence results and apply them to proximal gradient descent with variable step size and FISTA when solving partly-smooth problems. We test the convergence (rates) of these algorithms numerically through several experiments. Our theoretical and numerical results show that the convergence rate of the algorithm is reflected in its derivative iterates.
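To make the setting concrete, the following is a minimal, illustrative sketch (not the authors' code) of unrolled differentiation through proximal gradient descent with a time-varying step size, written in JAX. The lasso-type lower-level problem, the upper-level loss, the step-size schedule, and names such as unrolled_prox_grad are hypothetical choices made for this example; reverse-mode differentiation through the unrolled iterations yields the derivative iterates discussed in the paper.

# Illustrative sketch (assumed example, not the paper's code): differentiate an
# upper-level loss through K unrolled proximal gradient steps with varying step sizes.
import jax
import jax.numpy as jnp

def soft_threshold(x, tau):
    # Proximal operator of tau * ||.||_1.
    return jnp.sign(x) * jnp.maximum(jnp.abs(x) - tau, 0.0)

def unrolled_prox_grad(theta, A, b, step_sizes):
    # Lower-level solver: proximal gradient descent for
    #   min_x 0.5 * ||A x - b||^2 + theta * ||x||_1
    # with a per-iteration step size; JAX traces through every update.
    x = jnp.zeros(A.shape[1])
    for alpha_k in step_sizes:               # time-varying update rule
        grad = A.T @ (A @ x - b)             # gradient of the smooth part
        x = soft_threshold(x - alpha_k * grad, alpha_k * theta)
    return x

def upper_level_loss(theta, A, b, x_target, step_sizes):
    # Hypothetical upper-level objective: distance of the approximate
    # lower-level solution to a target point.
    x_K = unrolled_prox_grad(theta, A, b, step_sizes)
    return 0.5 * jnp.sum((x_K - x_target) ** 2)

# Small synthetic instance (all data hypothetical).
A = jax.random.normal(jax.random.PRNGKey(0), (20, 5))
b = jax.random.normal(jax.random.PRNGKey(1), (20,))
x_target = jnp.zeros(5)
L = jnp.linalg.norm(A.T @ A, ord=2)          # Lipschitz constant of the smooth gradient
step_sizes = [1.0 / (L * (1.0 + 0.01 * k)) for k in range(100)]  # varying steps

# Hypergradient of the upper-level loss w.r.t. theta via unrolled
# (reverse-mode) automatic differentiation.
hypergrad = jax.grad(upper_level_loss)(0.1, A, b, x_target, step_sizes)
print(hypergrad)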

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-mehmood25a,
  title     = {Automatic Differentiation of Optimization Algorithms with Time-Varying Updates},
  author    = {Mehmood, Sheheryar and Ochs, Peter},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages     = {43581--43602},
  year      = {2025},
  editor    = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume    = {267},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--19 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/mehmood25a/mehmood25a.pdf},
  url       = {https://proceedings.mlr.press/v267/mehmood25a.html},
  abstract  = {Numerous optimization algorithms have a time-varying update rule thanks to, for instance, a changing step size, momentum parameter or, Hessian approximation. Often, such algorithms are used as solvers for the lower-level problem in bilevel optimization, and are unrolled when computing the gradient of the upper-level objective. In this paper, we apply unrolled or automatic differentiation to a time-varying iterative process and provide convergence (rate) guarantees for the resulting derivative iterates. We then adapt these convergence results and apply them to proximal gradient descent with variable step size and FISTA when solving partly-smooth problems. We test the convergence (rates) of these algorithms numerically through several experiments. Our theoretical and numerical results show that the convergence rate of the algorithm is reflected in its derivative iterates.}
}
Endnote
%0 Conference Paper
%T Automatic Differentiation of Optimization Algorithms with Time-Varying Updates
%A Sheheryar Mehmood
%A Peter Ochs
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-mehmood25a
%I PMLR
%P 43581--43602
%U https://proceedings.mlr.press/v267/mehmood25a.html
%V 267
%X Numerous optimization algorithms have a time-varying update rule thanks to, for instance, a changing step size, momentum parameter or, Hessian approximation. Often, such algorithms are used as solvers for the lower-level problem in bilevel optimization, and are unrolled when computing the gradient of the upper-level objective. In this paper, we apply unrolled or automatic differentiation to a time-varying iterative process and provide convergence (rate) guarantees for the resulting derivative iterates. We then adapt these convergence results and apply them to proximal gradient descent with variable step size and FISTA when solving partly-smooth problems. We test the convergence (rates) of these algorithms numerically through several experiments. Our theoretical and numerical results show that the convergence rate of the algorithm is reflected in its derivative iterates.
APA
Mehmood, S. & Ochs, P. (2025). Automatic Differentiation of Optimization Algorithms with Time-Varying Updates. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:43581-43602. Available from https://proceedings.mlr.press/v267/mehmood25a.html.