Best of Both Worlds in Online Control: Competitive Ratio and Policy Regret

Gautam Goel, Naman Agarwal, Karan Singh, Elad Hazan
Proceedings of The 5th Annual Learning for Dynamics and Control Conference, PMLR 211:1345-1356, 2023.

Abstract

We consider the fundamental problem of online control of a linear dynamical system from two different viewpoints: regret minimization and competitive analysis. We prove that the optimal competitive policy is well-approximated by a convex parameterized policy class, known as disturbance-action control (DAC) policies. Using this structural result, we show that several recently proposed online control algorithms achieve the best of both worlds: sublinear regret vs. the best DAC policy selected in hindsight, and optimal competitive ratio, up to an additive correction which grows sublinearly in the time horizon. We further conclude that sublinear regret vs. the optimal competitive policy is attainable when the linear dynamical system is unknown, and even when a stabilizing controller for the dynamics is not available a priori.
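For readers less familiar with the terminology, the objects compared in the abstract can be sketched as follows; this is the standard online nonstochastic control formulation, and the exact conventions used in the paper may differ. The system evolves as x_{t+1} = A x_t + B u_t + w_t, where w_t is an adversarial disturbance. A disturbance-action control (DAC) policy with memory H and a fixed stabilizing gain K plays

    u_t = -K x_t + \sum_{i=1}^{H} M^{[i]} w_{t-i},

so the learnable parameters are the matrices M^{[1]}, ..., M^{[H]}, and the incurred cost is convex in them. Policy regret and competitive ratio then measure performance against two different benchmarks:

    \mathrm{Regret}_T = \sum_{t=1}^{T} c_t(x_t, u_t) \;-\; \min_{M} \sum_{t=1}^{T} c_t(x_t^{M}, u_t^{M}),
    \qquad
    \mathrm{CR} = \sup_{w_{1:T}} \frac{\sum_{t=1}^{T} c_t(x_t, u_t)}{\sum_{t=1}^{T} c_t(x_t^{\star}, u_t^{\star})},

where (x_t^{\star}, u_t^{\star}) denotes the trajectory of the offline optimal (clairvoyant) policy.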

Cite this Paper


BibTeX
@InProceedings{pmlr-v211-goel23a,
  title     = {Best of Both Worlds in Online Control: Competitive Ratio and Policy Regret},
  author    = {Goel, Gautam and Agarwal, Naman and Singh, Karan and Hazan, Elad},
  booktitle = {Proceedings of The 5th Annual Learning for Dynamics and Control Conference},
  pages     = {1345--1356},
  year      = {2023},
  editor    = {Matni, Nikolai and Morari, Manfred and Pappas, George J.},
  volume    = {211},
  series    = {Proceedings of Machine Learning Research},
  month     = {15--16 Jun},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v211/goel23a/goel23a.pdf},
  url       = {https://proceedings.mlr.press/v211/goel23a.html},
  abstract  = {We consider the fundamental problem of online control of a linear dynamical system from two different viewpoints: regret minimization and competitive analysis. We prove that the optimal competitive policy is well-approximated by a convex parameterized policy class, known as disturbance-action control (DAC) policies. Using this structural result, we show that several recently proposed online control algorithms achieve the best of both worlds: sublinear regret vs. the best DAC policy selected in hindsight, and optimal competitive ratio, up to an additive correction which grows sublinearly in the time horizon. We further conclude that sublinear regret vs. the optimal competitive policy is attainable when the linear dynamical system is unknown, and even when a stabilizing controller for the dynamics is not available a priori.}
}
Endnote
%0 Conference Paper
%T Best of Both Worlds in Online Control: Competitive Ratio and Policy Regret
%A Gautam Goel
%A Naman Agarwal
%A Karan Singh
%A Elad Hazan
%B Proceedings of The 5th Annual Learning for Dynamics and Control Conference
%C Proceedings of Machine Learning Research
%D 2023
%E Nikolai Matni
%E Manfred Morari
%E George J. Pappas
%F pmlr-v211-goel23a
%I PMLR
%P 1345--1356
%U https://proceedings.mlr.press/v211/goel23a.html
%V 211
%X We consider the fundamental problem of online control of a linear dynamical system from two different viewpoints: regret minimization and competitive analysis. We prove that the optimal competitive policy is well-approximated by a convex parameterized policy class, known as disturbance-action control (DAC) policies. Using this structural result, we show that several recently proposed online control algorithms achieve the best of both worlds: sublinear regret vs. the best DAC policy selected in hindsight, and optimal competitive ratio, up to an additive correction which grows sublinearly in the time horizon. We further conclude that sublinear regret vs. the optimal competitive policy is attainable when the linear dynamical system is unknown, and even when a stabilizing controller for the dynamics is not available a priori.
APA
Goel, G., Agarwal, N., Singh, K. & Hazan, E. (2023). Best of Both Worlds in Online Control: Competitive Ratio and Policy Regret. Proceedings of The 5th Annual Learning for Dynamics and Control Conference, in Proceedings of Machine Learning Research 211:1345-1356. Available from https://proceedings.mlr.press/v211/goel23a.html.
