Primal-dual Learning for the Model-free Risk-constrained Linear Quadratic Regulator
Proceedings of the 3rd Conference on Learning for Dynamics and Control, PMLR 144:702-714, 2021.
Abstract
Risk-aware control, though promising for handling unexpected events, typically requires an exactly known dynamical model. In this work, we propose a model-free framework to learn a risk-aware controller for a linear system. We formulate the problem as a discrete-time infinite-horizon LQR problem with a state predictive variance constraint. Since its optimal policy is known to be an affine feedback, i.e., $u^* = -Kx+l$, we instead optimize directly over the gain pair $(K,l)$ by designing a primal-dual learning algorithm. First, we show that the Lagrangian function enjoys an important local gradient dominance property. Based on this property, we then show that there is no duality gap despite the non-convex optimization landscape. Furthermore, we propose a primal-dual algorithm with global convergence guarantees to learn the optimal policy-multiplier pair. Finally, we validate our results via simulations.
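To make the primal-dual scheme concrete, below is a minimal numerical sketch, not the paper's algorithm. It assumes a hypothetical double-integrator system, substitutes an illustrative variance-of-stage-cost risk proxy for the paper's state predictive variance constraint, and uses two-point zeroth-order (perturbation-based) gradient estimates for the model-free primal step; the matrices `A`, `B`, `Q`, `R`, the risk budget `c`, and the step sizes `eta_p`, `eta_d` are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical system and weights (assumptions, not from the paper).
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Q, R = np.eye(2), 0.1 * np.eye(1)
sigma_w = 0.05            # process-noise standard deviation
c = 1.0                   # assumed risk budget
T, n_rollouts = 50, 8     # rollout horizon / rollouts per estimate

def rollout_costs(K, l):
    """Estimate the average LQR cost and an illustrative risk proxy
    (variance of stage costs) from finite-horizon rollouts."""
    costs, risks = [], []
    for _ in range(n_rollouts):
        x = rng.normal(size=2)
        stage = []
        for _ in range(T):
            u = -K @ x + l                       # affine feedback policy
            stage.append(x @ Q @ x + u @ R @ u)  # stage cost
            x = A @ x + B @ u + sigma_w * rng.normal(size=2)
        stage = np.asarray(stage)
        costs.append(stage.mean())
        risks.append(stage.var())                # risk proxy (assumption)
    return np.mean(costs), np.mean(risks)

def lagrangian(K, l, lam):
    """L(K, l, lam) = J(K, l) + lam * (J_c(K, l) - c)."""
    J, Jc = rollout_costs(K, l)
    return J + lam * (Jc - c)

def zo_grad(K, l, lam, r=0.05, n_dirs=10):
    """Two-point zeroth-order estimate of the Lagrangian's gradient
    with respect to (K, l), using Gaussian perturbation directions."""
    gK, gl = np.zeros_like(K), np.zeros_like(l)
    for _ in range(n_dirs):
        dK, dl = rng.normal(size=K.shape), rng.normal(size=l.shape)
        plus = lagrangian(K + r * dK, l + r * dl, lam)
        minus = lagrangian(K - r * dK, l - r * dl, lam)
        scale = (plus - minus) / (2 * r * n_dirs)
        gK += scale * dK
        gl += scale * dl
    return gK, gl

K, l, lam = np.zeros((1, 2)), np.zeros(1), 0.0
eta_p, eta_d = 1e-3, 1e-2

for it in range(200):
    gK, gl = zo_grad(K, l, lam)                  # primal: descend Lagrangian
    K, l = K - eta_p * gK, l - eta_p * gl
    _, Jc = rollout_costs(K, l)                  # dual: projected ascent
    lam = max(0.0, lam + eta_d * (Jc - c))       # keep multiplier >= 0
```

The structure mirrors the abstract's description: the primal step updates the gain pair $(K,l)$ along an estimated gradient of the Lagrangian, while the dual step updates the multiplier by projected gradient ascent on the constraint violation, so the pair converges toward a policy-multiplier saddle point under the assumed setup.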