Learning Linear-Quadratic Regulators Efficiently with only $\sqrt{T}$ Regret
Proceedings of the 36th International Conference on Machine Learning, PMLR 97:1300-1309, 2019.
Abstract
We present the first computationally efficient algorithm with $\widetilde{O}(\sqrt{T})$ regret for learning in Linear Quadratic Control systems with unknown dynamics. By that, we resolve an open question of Abbasi-Yadkori and Szepesvari (2011) and Dean, Mania, Matni, Recht, and Tu (2018).