Optimizing Optimizers: Regret-optimal gradient descent algorithms

Philippe Casgrain, Anastasis Kratsios
Proceedings of Thirty Fourth Conference on Learning Theory, PMLR 134:883-926, 2021.

Abstract

This paper treats the task of designing optimization algorithms as an optimal control problem. Using regret as a metric for an algorithm’s performance, we study the existence, uniqueness, and consistency of regret-optimal algorithms. By providing first-order optimality conditions for the control problem, we show that regret-optimal algorithms must satisfy a specific structure in their dynamics, which is equivalent to performing dual-preconditioned gradient descent on the value function generated by their regret. Using these optimal dynamics, we provide bounds on their rates of convergence to solutions of convex optimization problems. Though closed-form optimal dynamics cannot be obtained in general, we present fast numerical methods for approximating them, generating optimization algorithms that directly optimize their long-term regret. These are benchmarked against commonly used optimization algorithms to demonstrate their effectiveness.
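The update rule named in the abstract, dual-preconditioned gradient descent, applies a map to the gradient (the dual variable) before taking a step, rather than preconditioning the iterate itself. The sketch below is a minimal illustration of that update shape on a convex quadratic; the objective, the fixed scaling used as the dual preconditioner, and the function names are assumptions made for illustration, since the paper derives its preconditioner from the regret value function and approximates it numerically.

import numpy as np

# Sketch of a dual-preconditioned gradient descent step:
#     x_{k+1} = x_k - D(grad f(x_k)),
# where D acts on the gradient (the dual variable) rather than on x.
# The quadratic objective and the scalar choice of D below are
# illustrative assumptions; the paper's preconditioner comes from the
# regret value function and must generally be computed numerically.

def grad_f(x, A, b):
    # Gradient of the convex quadratic f(x) = 0.5 * x @ A @ x - b @ x.
    return A @ x - b

def dual_preconditioned_gd(x0, A, b, D, steps=500):
    # Iterate the dual-preconditioned update for a fixed number of steps.
    x = x0.copy()
    for _ in range(steps):
        x = x - D(grad_f(x, A, b))
    return x

rng = np.random.default_rng(0)
M = rng.standard_normal((5, 5))
A = M @ M.T + np.eye(5)                 # symmetric positive definite
b = rng.standard_normal(5)

L = np.linalg.eigvalsh(A).max()         # smoothness constant of f
D = lambda g: g / L                     # assumed preconditioner: plain scaling
# (Choosing D(g) = np.linalg.solve(A, g) would recover Newton's method.)

x_star = np.linalg.solve(A, b)          # exact minimizer, for comparison
x_hat = dual_preconditioned_gd(np.zeros(5), A, b, D)
print(np.linalg.norm(x_hat - x_star))   # distance to the true solution

With a constant scaling, the template reduces to ordinary gradient descent; the paper’s contribution is to characterize the particular gradient-space map that regret optimality induces, applied to the value function generated by the algorithm’s regret.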

Cite this Paper


BibTeX
@InProceedings{pmlr-v134-casgrain21a,
  title     = {Optimizing Optimizers: Regret-optimal gradient descent algorithms},
  author    = {Casgrain, Philippe and Kratsios, Anastasis},
  booktitle = {Proceedings of Thirty Fourth Conference on Learning Theory},
  pages     = {883--926},
  year      = {2021},
  editor    = {Belkin, Mikhail and Kpotufe, Samory},
  volume    = {134},
  series    = {Proceedings of Machine Learning Research},
  month     = {15--19 Aug},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v134/casgrain21a/casgrain21a.pdf},
  url       = {https://proceedings.mlr.press/v134/casgrain21a.html},
  abstract  = {This paper treats the task of designing optimization algorithms as an optimal control problem. Using regret as a metric for an algorithm’s performance, we study the existence, uniqueness and consistency of regret-optimal algorithms. By providing first-order optimality conditions for the control problem, we show that regret-optimal algorithms must satisfy a specific structure in their dynamics which we show is equivalent to performing \emph{dual-preconditioned gradient descent} on the value function generated by its regret. Using these optimal dynamics, we provide bounds on their rates of convergence to solutions of convex optimization problems. Though closed-form optimal dynamics cannot be obtained in general, we present fast numerical methods for approximating them, generating optimization algorithms which directly optimize their long-term regret. These are benchmarked against commonly used optimization algorithms to demonstrate their effectiveness.}
}
Endnote
%0 Conference Paper
%T Optimizing Optimizers: Regret-optimal gradient descent algorithms
%A Philippe Casgrain
%A Anastasis Kratsios
%B Proceedings of Thirty Fourth Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2021
%E Mikhail Belkin
%E Samory Kpotufe
%F pmlr-v134-casgrain21a
%I PMLR
%P 883--926
%U https://proceedings.mlr.press/v134/casgrain21a.html
%V 134
%X This paper treats the task of designing optimization algorithms as an optimal control problem. Using regret as a metric for an algorithm’s performance, we study the existence, uniqueness and consistency of regret-optimal algorithms. By providing first-order optimality conditions for the control problem, we show that regret-optimal algorithms must satisfy a specific structure in their dynamics which we show is equivalent to performing \emph{dual-preconditioned gradient descent} on the value function generated by its regret. Using these optimal dynamics, we provide bounds on their rates of convergence to solutions of convex optimization problems. Though closed-form optimal dynamics cannot be obtained in general, we present fast numerical methods for approximating them, generating optimization algorithms which directly optimize their long-term regret. These are benchmarked against commonly used optimization algorithms to demonstrate their effectiveness.
APA
Casgrain, P. & Kratsios, A. (2021). Optimizing Optimizers: Regret-optimal gradient descent algorithms. Proceedings of Thirty Fourth Conference on Learning Theory, in Proceedings of Machine Learning Research 134:883-926. Available from https://proceedings.mlr.press/v134/casgrain21a.html.
