Learning Linear Quadratic Regulators Efficiently with only $\sqrt{T}$ Regret
Proceedings of the 36th International Conference on Machine Learning, PMLR 97:1300-1309, 2019.
Abstract
We present the first computationally efficient algorithm with $\widetilde{O}(\sqrt{T})$ regret for learning in Linear Quadratic Control systems with unknown dynamics. This resolves an open question of Abbasi-Yadkori and Szepesvári (2011) and Dean, Mania, Matni, Recht, and Tu (2018).
