Regret Bounds for the Adaptive Control of Linear Quadratic Systems

Yasin Abbasi-Yadkori; Csaba Szepesvári

Proceedings of the 24th Annual Conference on Learning Theory, PMLR 19:1-26, 2011.

Abstract

We study the average cost Linear Quadratic (LQ) control problem with unknown model parameters, also known as the adaptive control problem in the control community. We design an algorithm and prove that, apart from logarithmic factors, its regret up to time $T$ is $O(\sqrt{T})$. Unlike previous approaches that use a forced-exploration scheme, we construct a high-probability confidence set around the model parameters and design an algorithm that plays optimistically with respect to this confidence set. The construction of the confidence set is based on recent results from online least-squares estimation and leads to an improved worst-case regret bound for the proposed algorithm. To the best of our knowledge, this is the first time that a regret bound is derived for the LQ control problem.

Cite this Paper

BibTeX


@InProceedings{pmlr-v19-abbasi-yadkori11a,
  title = 	 {Regret Bounds for the Adaptive Control of Linear Quadratic Systems},
  author =       {Abbasi-Yadkori, Yasin and Szepesv\'ari, Csaba},
  booktitle = 	 {Proceedings of the 24th Annual Conference on Learning Theory},
  pages = 	 {1--26},
  year = 	 {2011},
  editor = 	 {Kakade, Sham M. and von Luxburg, Ulrike},
  volume = 	 {19},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Budapest, Hungary},
  month = 	 {09--11 Jun},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v19/abbasi-yadkori11a/abbasi-yadkori11a.pdf},
  url = 	 {https://proceedings.mlr.press/v19/abbasi-yadkori11a.html},
  abstract = 	 {We study the average cost Linear Quadratic (LQ) control problem with unknown model parameters, also known as the adaptive control problem in the control community. We design an algorithm and prove that, apart from logarithmic factors, its regret up to time $T$ is $O(\sqrt{T})$. Unlike previous approaches that use a forced-exploration scheme, we construct a high-probability confidence set around the model parameters and design an algorithm that plays optimistically with respect to this confidence set. The construction of the confidence set is based on recent results from online least-squares estimation and leads to an improved worst-case regret bound for the proposed algorithm. To the best of our knowledge, this is the first time that a regret bound is derived for the LQ control problem.}
}

Endnote

%0 Conference Paper
%T Regret Bounds for the Adaptive Control of Linear Quadratic Systems
%A Yasin Abbasi-Yadkori
%A Csaba Szepesvári
%B Proceedings of the 24th Annual Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2011
%E Sham M. Kakade
%E Ulrike von Luxburg
%F pmlr-v19-abbasi-yadkori11a
%I PMLR
%P 1--26
%U https://proceedings.mlr.press/v19/abbasi-yadkori11a.html
%V 19
%X We study the average cost Linear Quadratic (LQ) control problem with unknown model parameters, also known as the adaptive control problem in the control community. We design an algorithm and prove that, apart from logarithmic factors, its regret up to time $T$ is $O(\sqrt{T})$. Unlike previous approaches that use a forced-exploration scheme, we construct a high-probability confidence set around the model parameters and design an algorithm that plays optimistically with respect to this confidence set. The construction of the confidence set is based on recent results from online least-squares estimation and leads to an improved worst-case regret bound for the proposed algorithm. To the best of our knowledge, this is the first time that a regret bound is derived for the LQ control problem.

RIS


TY  - CPAPER
TI  - Regret Bounds for the Adaptive Control of Linear Quadratic Systems
AU  - Yasin Abbasi-Yadkori
AU  - Csaba Szepesvári
BT  - Proceedings of the 24th Annual Conference on Learning Theory
DA  - 2011/12/21
ED  - Sham M. Kakade
ED  - Ulrike von Luxburg
ID  - pmlr-v19-abbasi-yadkori11a
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 19
SP  - 1
EP  - 26
L1  - http://proceedings.mlr.press/v19/abbasi-yadkori11a/abbasi-yadkori11a.pdf
UR  - https://proceedings.mlr.press/v19/abbasi-yadkori11a.html
AB  - We study the average cost Linear Quadratic (LQ) control problem with unknown model parameters, also known as the adaptive control problem in the control community. We design an algorithm and prove that, apart from logarithmic factors, its regret up to time $T$ is $O(\sqrt{T})$. Unlike previous approaches that use a forced-exploration scheme, we construct a high-probability confidence set around the model parameters and design an algorithm that plays optimistically with respect to this confidence set. The construction of the confidence set is based on recent results from online least-squares estimation and leads to an improved worst-case regret bound for the proposed algorithm. To the best of our knowledge, this is the first time that a regret bound is derived for the LQ control problem.
ER  -

APA


Abbasi-Yadkori, Y. & Szepesvári, C. (2011). Regret Bounds for the Adaptive Control of Linear Quadratic Systems. Proceedings of the 24th Annual Conference on Learning Theory, in Proceedings of Machine Learning Research 19:1-26. Available from https://proceedings.mlr.press/v19/abbasi-yadkori11a.html.
