Learning to Switch Optimizers for Quadratic Programming

Grant Getzelman, Prasanna Balaprakash
Proceedings of The 13th Asian Conference on Machine Learning, PMLR 157:1553-1568, 2021.

Abstract

Quadratic programming (QP) seeks to solve optimization problems involving quadratic functions that may include complex boundary constraints. In its unrestricted form, QP is $\mathcal{NP}$-hard, but it becomes tractable when restricted to the convex case. Active-set and interior-point methods are used to solve convex problems, while in the nonconvex case various heuristics or relaxations are used to produce high-quality solutions in finite time. Learning to optimize (L2O) is an emerging approach to designing solvers for optimization problems. We develop an L2O approach that uses reinforcement learning to learn a stochastic policy that switches between pre-existing optimization algorithms to solve QP problem instances. In particular, our agent switches between three simple optimizers: Adam, gradient descent, and random search. Our experiments show that the learned optimizer minimizes quadratic functions faster and finds better-quality solutions in the long term than any of the individual optimizers it switches between. We also compare our solver with the standard QP algorithms in MATLAB and find that it achieves better performance with fewer function evaluations.
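
To make the switching scheme concrete, the sketch below mimics the loop the abstract describes: at each step a policy picks one of the three optimizers (Adam, gradient descent, or random search) and applies a single update. This is only an illustration, not the authors' implementation: the learned stochastic policy is replaced by a hypothetical uniform-random stand-in, the test problem is an unconstrained convex quadratic, and all step sizes are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

# A random convex quadratic: f(x) = 0.5 x^T Q x + b^T x, with Q positive definite.
n = 10
A = rng.standard_normal((n, n))
Q = A.T @ A + np.eye(n)
b = rng.standard_normal(n)

def f(x):
    return 0.5 * x @ Q @ x + b @ x

def grad(x):
    return Q @ x + b

def gd_step(x, state, lr=1e-2):
    # Plain gradient descent.
    return x - lr * grad(x), state

def adam_step(x, state, lr=1e-2, b1=0.9, b2=0.999, eps=1e-8):
    # Adam with bias-corrected first and second moment estimates.
    m = state.get("m", np.zeros_like(x))
    v = state.get("v", np.zeros_like(x))
    t = state.get("t", 0) + 1
    g = grad(x)
    m = b1 * m + (1 - b1) * g
    v = b2 * v + (1 - b2) * g * g
    mhat, vhat = m / (1 - b1**t), v / (1 - b2**t)
    state.update(m=m, v=v, t=t)
    return x - lr * mhat / (np.sqrt(vhat) + eps), state

def random_search_step(x, state, sigma=0.1):
    # Keep a random perturbation only if it improves the objective.
    cand = x + sigma * rng.standard_normal(x.shape)
    return (cand if f(cand) < f(x) else x), state

optimizers = [adam_step, gd_step, random_search_step]
states = [dict() for _ in optimizers]

x = rng.standard_normal(n)
for step in range(200):
    a = rng.integers(len(optimizers))  # stand-in for the learned stochastic policy
    x, states[a] = optimizers[a](x, states[a])

print("final objective:", f(x))
```

Running this for a few hundred steps drives f(x) toward its minimum; in the paper, the choice of optimizer at each step is instead drawn from a stochastic policy trained with reinforcement learning.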

Cite this Paper


BibTeX
@InProceedings{pmlr-v157-getzelman21a,
  title     = {Learning to Switch Optimizers for Quadratic Programming},
  author    = {Getzelman, Grant and Balaprakash, Prasanna},
  booktitle = {Proceedings of The 13th Asian Conference on Machine Learning},
  pages     = {1553--1568},
  year      = {2021},
  editor    = {Balasubramanian, Vineeth N. and Tsang, Ivor},
  volume    = {157},
  series    = {Proceedings of Machine Learning Research},
  month     = {17--19 Nov},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v157/getzelman21a/getzelman21a.pdf},
  url       = {https://proceedings.mlr.press/v157/getzelman21a.html}
}
Endnote
%0 Conference Paper
%T Learning to Switch Optimizers for Quadratic Programming
%A Grant Getzelman
%A Prasanna Balaprakash
%B Proceedings of The 13th Asian Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2021
%E Vineeth N. Balasubramanian
%E Ivor Tsang
%F pmlr-v157-getzelman21a
%I PMLR
%P 1553--1568
%U https://proceedings.mlr.press/v157/getzelman21a.html
%V 157
APA
Getzelman, G. & Balaprakash, P. (2021). Learning to Switch Optimizers for Quadratic Programming. Proceedings of The 13th Asian Conference on Machine Learning, in Proceedings of Machine Learning Research 157:1553-1568. Available from https://proceedings.mlr.press/v157/getzelman21a.html.
