Primal-dual Learning for the Model-free Risk-constrained Linear Quadratic Regulator

Feiran Zhao, Keyou You
Proceedings of the 3rd Conference on Learning for Dynamics and Control, PMLR 144:702-714, 2021.

Abstract

Risk-aware control, though with promise to tackle unexpected events, requires a known exact dynamical model. In this work, we propose a model-free framework to learn a risk-aware controller of a linear system. We formulate it as a discrete-time infinite-horizon LQR problem with a state predictive variance constraint. Since its optimal policy is known as an affine feedback, i.e., $u^* = -Kx+l$, we alternatively optimize the gain pair $(K,l)$ by designing a primal-dual learning algorithm. First, we observe that the Lagrangian function enjoys an important local gradient dominance property. Based on it, we then show that there is no duality gap despite the non-convex optimization landscape. Furthermore, we propose a primal-dual algorithm with global convergence to learn the optimal policy-multiplier pair. Finally, we validate our results via simulations.
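
As a rough sketch of what a primal-dual scheme of this kind typically looks like, the constrained problem, its Lagrangian, and a generic primal-dual iteration could be written as below. The symbols $J$, $J_c$, $\rho$, $\lambda$, $\eta$, and the iteration index $k$ are illustrative assumptions, not notation taken from the paper, and the updates are the textbook form rather than the authors' algorithm.

$$
\begin{aligned}
&\min_{K,\,l}\; J(K,l) \quad \text{subject to} \quad J_c(K,l) \le \rho, \\
&\mathcal{L}(K,l,\lambda) = J(K,l) + \lambda\,\bigl(J_c(K,l) - \rho\bigr), \qquad \lambda \ge 0, \\
&(K_{k+1},\, l_{k+1}) \in \arg\min_{K,\,l}\; \mathcal{L}(K,l,\lambda_k), \\
&\lambda_{k+1} = \max\bigl\{0,\; \lambda_k + \eta\,\bigl(J_c(K_{k+1},l_{k+1}) - \rho\bigr)\bigr\}.
\end{aligned}
$$

Here $J$ is assumed to denote the LQR cost of the affine policy $u = -Kx + l$, $J_c$ the state predictive variance term, $\rho$ the constraint level, and $\eta > 0$ a dual step size. With no duality gap, maximizing the dual function $\min_{K,l} \mathcal{L}(K,l,\lambda)$ over $\lambda \ge 0$ recovers the optimal value of the constrained problem.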

Cite this Paper


BibTeX
@InProceedings{pmlr-v144-zhao21b,
  title = {Primal-dual Learning for the Model-free Risk-constrained Linear Quadratic Regulator},
  author = {Zhao, Feiran and You, Keyou},
  booktitle = {Proceedings of the 3rd Conference on Learning for Dynamics and Control},
  pages = {702--714},
  year = {2021},
  editor = {Jadbabaie, Ali and Lygeros, John and Pappas, George J. and Parrilo, Pablo A. and Recht, Benjamin and Tomlin, Claire J. and Zeilinger, Melanie N.},
  volume = {144},
  series = {Proceedings of Machine Learning Research},
  month = {07 -- 08 June},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v144/zhao21b/zhao21b.pdf},
  url = {https://proceedings.mlr.press/v144/zhao21b.html},
  abstract = {Risk-aware control, though with promise to tackle unexpected events, requires a known exact dynamical model. In this work, we propose a model-free framework to learn a risk-aware controller of a linear system. We formulate it as a discrete-time infinite-horizon LQR problem with a state predictive variance constraint. Since its optimal policy is known as an affine feedback, i.e., $u^* = -Kx+l$, we alternatively optimize the gain pair $(K,l)$ by designing a primal-dual learning algorithm. First, we observe that the Lagrangian function enjoys an important local gradient dominance property. Based on it, we then show that there is no duality gap despite the non-convex optimization landscape. Furthermore, we propose a primal-dual algorithm with global convergence to learn the optimal policy-multiplier pair. Finally, we validate our results via simulations.}
}
Endnote
%0 Conference Paper
%T Primal-dual Learning for the Model-free Risk-constrained Linear Quadratic Regulator
%A Feiran Zhao
%A Keyou You
%B Proceedings of the 3rd Conference on Learning for Dynamics and Control
%C Proceedings of Machine Learning Research
%D 2021
%E Ali Jadbabaie
%E John Lygeros
%E George J. Pappas
%E Pablo A. Parrilo
%E Benjamin Recht
%E Claire J. Tomlin
%E Melanie N. Zeilinger
%F pmlr-v144-zhao21b
%I PMLR
%P 702--714
%U https://proceedings.mlr.press/v144/zhao21b.html
%V 144
%X Risk-aware control, though with promise to tackle unexpected events, requires a known exact dynamical model. In this work, we propose a model-free framework to learn a risk-aware controller of a linear system. We formulate it as a discrete-time infinite-horizon LQR problem with a state predictive variance constraint. Since its optimal policy is known as an affine feedback, i.e., $u^* = -Kx+l$, we alternatively optimize the gain pair $(K,l)$ by designing a primal-dual learning algorithm. First, we observe that the Lagrangian function enjoys an important local gradient dominance property. Based on it, we then show that there is no duality gap despite the non-convex optimization landscape. Furthermore, we propose a primal-dual algorithm with global convergence to learn the optimal policy-multiplier pair. Finally, we validate our results via simulations.
APA
Zhao, F. & You, K. (2021). Primal-dual Learning for the Model-free Risk-constrained Linear Quadratic Regulator. Proceedings of the 3rd Conference on Learning for Dynamics and Control, in Proceedings of Machine Learning Research 144:702-714. Available from https://proceedings.mlr.press/v144/zhao21b.html.
