Online Linear Quadratic Control

Alon Cohen, Avinatan Hasidim, Tomer Koren, Nevena Lazic, Yishay Mansour, Kunal Talwar
Proceedings of the 35th International Conference on Machine Learning, PMLR 80:1029-1038, 2018.

Abstract

We study the problem of controlling linear time-invariant systems with known noisy dynamics and adversarially chosen quadratic losses. We present the first efficient online learning algorithms in this setting that guarantee $O(\sqrt{T})$ regret under mild assumptions, where $T$ is the time horizon. Our algorithms rely on a novel SDP relaxation for the steady-state distribution of the system. Crucially, and in contrast to previously proposed relaxations, the feasible solutions of our SDP all correspond to “strongly stable” policies that mix exponentially fast to a steady state.
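The SDP in the paper optimizes over steady-state distributions of the controlled system. As an illustrative numpy sketch (not the paper's algorithm, and with a made-up 2-D system and gain `K`), the object the relaxation targets is the steady-state state covariance of a fixed stabilizing linear policy, obtained as the fixed point of a Lyapunov recursion; the corresponding steady-state quadratic cost is a trace against that covariance:

```python
import numpy as np

# Hypothetical 2-D system: x_{t+1} = A x_t + B u_t + w_t, with noise covariance W.
A = np.array([[0.9, 0.2], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
W = 0.1 * np.eye(2)

# A fixed linear policy u_t = K x_t, chosen here so that A + B K is stable.
K = np.array([[-0.3, -0.5]])
Q = np.eye(2)  # state cost matrix
R = np.eye(1)  # control cost matrix

M = A + B @ K
# Stability check: spectral radius below 1 means the state distribution
# mixes geometrically fast to its steady state.
assert np.max(np.abs(np.linalg.eigvals(M))) < 1.0

# Iterate the Lyapunov recursion Sigma <- M Sigma M^T + W to its fixed point.
Sigma = np.zeros((2, 2))
for _ in range(1000):
    Sigma = M @ Sigma @ M.T + W

# Steady-state expected quadratic cost of the policy.
steady_cost = np.trace((Q + K.T @ R @ K) @ Sigma)
```

The paper's SDP replaces this policy-by-policy computation with a single convex program whose feasible points all correspond to strongly stable policies; the sketch above only shows the steady-state quantity that program reasons about.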

Cite this Paper


BibTeX
@InProceedings{pmlr-v80-cohen18b,
  title     = {Online Linear Quadratic Control},
  author    = {Cohen, Alon and Hasidim, Avinatan and Koren, Tomer and Lazic, Nevena and Mansour, Yishay and Talwar, Kunal},
  booktitle = {Proceedings of the 35th International Conference on Machine Learning},
  pages     = {1029--1038},
  year      = {2018},
  editor    = {Dy, Jennifer and Krause, Andreas},
  volume    = {80},
  series    = {Proceedings of Machine Learning Research},
  month     = {10--15 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v80/cohen18b/cohen18b.pdf},
  url       = {http://proceedings.mlr.press/v80/cohen18b.html},
  abstract  = {We study the problem of controlling linear time-invariant systems with known noisy dynamics and adversarially chosen quadratic losses. We present the first efficient online learning algorithms in this setting that guarantee $O(\sqrt{T})$ regret under mild assumptions, where $T$ is the time horizon. Our algorithms rely on a novel SDP relaxation for the steady-state distribution of the system. Crucially, and in contrast to previously proposed relaxations, the feasible solutions of our SDP all correspond to “strongly stable” policies that mix exponentially fast to a steady state.}
}
Endnote
%0 Conference Paper
%T Online Linear Quadratic Control
%A Alon Cohen
%A Avinatan Hasidim
%A Tomer Koren
%A Nevena Lazic
%A Yishay Mansour
%A Kunal Talwar
%B Proceedings of the 35th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2018
%E Jennifer Dy
%E Andreas Krause
%F pmlr-v80-cohen18b
%I PMLR
%P 1029--1038
%U http://proceedings.mlr.press/v80/cohen18b.html
%V 80
%X We study the problem of controlling linear time-invariant systems with known noisy dynamics and adversarially chosen quadratic losses. We present the first efficient online learning algorithms in this setting that guarantee $O(\sqrt{T})$ regret under mild assumptions, where $T$ is the time horizon. Our algorithms rely on a novel SDP relaxation for the steady-state distribution of the system. Crucially, and in contrast to previously proposed relaxations, the feasible solutions of our SDP all correspond to “strongly stable” policies that mix exponentially fast to a steady state.
APA
Cohen, A., Hasidim, A., Koren, T., Lazic, N., Mansour, Y., & Talwar, K. (2018). Online Linear Quadratic Control. Proceedings of the 35th International Conference on Machine Learning, in Proceedings of Machine Learning Research 80:1029-1038. Available from http://proceedings.mlr.press/v80/cohen18b.html.