Learning Linear-Quadratic Regulators Efficiently with only $\sqrt{T}$ Regret
Proceedings of the 36th International Conference on Machine Learning, PMLR 97:1300-1309, 2019.
Abstract
We present the first computationally-efficient algorithm with $\widetilde{O}(\sqrt{T})$ regret for learning in Linear Quadratic Control systems with unknown dynamics. By that, we resolve an open question of Abbasi-Yadkori and Szepesvári (2011) and Dean, Mania, Matni, Recht, and Tu (2018).