Online Linear Quadratic Control

Alon Cohen; Avinatan Hasidim; Tomer Koren; Nevena Lazic; Yishay Mansour; Kunal Talwar

Proceedings of the 35th International Conference on Machine Learning, PMLR 80:1029-1038, 2018.

Abstract

We study the problem of controlling linear time-invariant systems with known noisy dynamics and adversarially chosen quadratic losses. We present the first efficient online learning algorithms in this setting that guarantee $O(\sqrt{T})$ regret under mild assumptions, where $T$ is the time horizon. Our algorithms rely on a novel SDP relaxation for the steady-state distribution of the system. Crucially, and in contrast to previously proposed relaxations, the feasible solutions of our SDP all correspond to “strongly stable” policies that mix exponentially fast to a steady state.

Cite this Paper

BibTeX


@InProceedings{pmlr-v80-cohen18b,
  title =     {Online Linear Quadratic Control},
  author =    {Cohen, Alon and Hasidim, Avinatan and Koren, Tomer and Lazic, Nevena and Mansour, Yishay and Talwar, Kunal},
  booktitle = {Proceedings of the 35th International Conference on Machine Learning},
  pages =     {1029--1038},
  year =      {2018},
  editor =    {Dy, Jennifer and Krause, Andreas},
  volume =    {80},
  series =    {Proceedings of Machine Learning Research},
  month =     {10--15 Jul},
  publisher = {PMLR},
  pdf =       {http://proceedings.mlr.press/v80/cohen18b/cohen18b.pdf},
  url =       {https://proceedings.mlr.press/v80/cohen18b.html},
  abstract = 	 {We study the problem of controlling linear time-invariant systems with known noisy dynamics and adversarially chosen quadratic losses. We present the first efficient online learning algorithms in this setting that guarantee $O(\sqrt{T})$ regret under mild assumptions, where $T$ is the time horizon. Our algorithms rely on a novel SDP relaxation for the steady-state distribution of the system. Crucially, and in contrast to previously proposed relaxations, the feasible solutions of our SDP all correspond to “strongly stable” policies that mix exponentially fast to a steady state.}
}

Endnote

%0 Conference Paper
%T Online Linear Quadratic Control
%A Alon Cohen
%A Avinatan Hasidim
%A Tomer Koren
%A Nevena Lazic
%A Yishay Mansour
%A Kunal Talwar
%B Proceedings of the 35th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2018
%E Jennifer Dy
%E Andreas Krause
%F pmlr-v80-cohen18b
%I PMLR
%P 1029--1038
%U https://proceedings.mlr.press/v80/cohen18b.html
%V 80
%X We study the problem of controlling linear time-invariant systems with known noisy dynamics and adversarially chosen quadratic losses. We present the first efficient online learning algorithms in this setting that guarantee $O(\sqrt{T})$ regret under mild assumptions, where $T$ is the time horizon. Our algorithms rely on a novel SDP relaxation for the steady-state distribution of the system. Crucially, and in contrast to previously proposed relaxations, the feasible solutions of our SDP all correspond to “strongly stable” policies that mix exponentially fast to a steady state.

APA


Cohen, A., Hasidim, A., Koren, T., Lazic, N., Mansour, Y. & Talwar, K. (2018). Online Linear Quadratic Control. Proceedings of the 35th International Conference on Machine Learning, in Proceedings of Machine Learning Research 80:1029-1038. Available from https://proceedings.mlr.press/v80/cohen18b.html.
