Meta-learning linear quadratic regulators: a policy gradient MAML approach for model-free LQR

Leonardo Felipe Toso, Donglin Zhan, James Anderson, Han Wang
Proceedings of the 6th Annual Learning for Dynamics & Control Conference, PMLR 242:902-915, 2024.

Abstract

We investigate the problem of learning linear quadratic regulators (LQR) in a multi-task, heterogeneous, and model-free setting. We characterize the stability and personalization guarantees of a policy gradient-based (PG) model-agnostic meta-learning (MAML) (Finn et al., 2017) approach for the LQR problem under different task-heterogeneity settings. We show that our MAML-LQR algorithm produces a stabilizing controller close to each task-specific optimal controller up to a task-heterogeneity bias in both model-based and model-free learning scenarios. Moreover, in the model-based setting, we show that such a controller is achieved with a linear convergence rate, which improves upon sub-linear rates from existing work. Our theoretical guarantees demonstrate that the learned controller can efficiently adapt to unseen LQR tasks.
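
To make the approach concrete, the following is a minimal sketch of a MAML-style policy-gradient update for multi-task LQR. It is an illustration under stated assumptions, not the paper's implementation: it uses exact model-based gradients in the form of Fazel et al. (2018), a first-order approximation of the MAML outer gradient, and a hypothetical toy task set. The model-free variant studied in the paper would replace lqr_cost_grad below with zeroth-order gradient estimates from sampled trajectories.

import numpy as np
from scipy.linalg import solve_discrete_lyapunov

def lqr_cost_grad(A, B, Q, R, K, Sigma0):
    # Exact LQR cost and policy gradient for u_t = -K x_t
    # (gradient form from Fazel et al., 2018).
    Acl = A - B @ K
    # P_K solves the closed-loop Lyapunov equation P = Acl' P Acl + Q + K' R K.
    P = solve_discrete_lyapunov(Acl.T, Q + K.T @ R @ K)
    # Sigma_K is the aggregate closed-loop state covariance:
    # Sigma = Acl Sigma Acl' + Sigma0.
    Sigma = solve_discrete_lyapunov(Acl, Sigma0)
    grad = 2.0 * ((R + B.T @ P @ B) @ K - B.T @ P @ A) @ Sigma
    cost = np.trace(P @ Sigma0)
    return cost, grad

def maml_lqr_step(K, tasks, Sigma0, alpha=1e-3, beta=1e-3):
    # One first-order MAML outer step: adapt K to each task with one inner
    # policy-gradient step, then average the gradients at the adapted policies.
    outer_grad = np.zeros_like(K)
    for A, B, Q, R in tasks:
        _, g = lqr_cost_grad(A, B, Q, R, K, Sigma0)
        K_adapted = K - alpha * g                      # inner (task-specific) step
        _, g_adapted = lqr_cost_grad(A, B, Q, R, K_adapted, Sigma0)
        outer_grad += g_adapted
    return K - beta * outer_grad / len(tasks)          # outer (meta) step

# Hypothetical toy example: two scalar tasks whose dynamics differ in A,
# modeling task heterogeneity. All constants here are illustrative.
tasks = [(np.array([[0.9]]), np.eye(1), np.eye(1), np.eye(1)),
         (np.array([[0.7]]), np.eye(1), np.eye(1), np.eye(1))]
K = np.zeros((1, 1))  # K = 0 is stabilizing for these open-loop-stable tasks
for _ in range(200):
    K = maml_lqr_step(K, tasks, Sigma0=np.eye(1))

The resulting K then serves as an initialization that each task can refine with a few task-specific policy-gradient steps, which is the personalization behavior the guarantees above quantify.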

Cite this Paper


BibTeX
@InProceedings{pmlr-v242-toso24a,
  title     = {Meta-learning linear quadratic regulators: {A} policy gradient {MAML} approach for model-free {LQR}},
  author    = {Toso, Leonardo Felipe and Zhan, Donglin and Anderson, James and Wang, Han},
  booktitle = {Proceedings of the 6th Annual Learning for Dynamics \& Control Conference},
  pages     = {902--915},
  year      = {2024},
  editor    = {Abate, Alessandro and Cannon, Mark and Margellos, Kostas and Papachristodoulou, Antonis},
  volume    = {242},
  series    = {Proceedings of Machine Learning Research},
  month     = {15--17 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v242/toso24a/toso24a.pdf},
  url       = {https://proceedings.mlr.press/v242/toso24a.html},
  abstract  = {We investigate the problem of learning linear quadratic regulators (LQR) in a multi-task, heterogeneous, and model-free setting. We characterize the stability and personalization guarantees of a policy gradient-based (PG) model-agnostic meta-learning (MAML) (Finn et al., 2017) approach for the LQR problem under different task-heterogeneity settings. We show that our MAML-LQR algorithm produces a stabilizing controller close to each task-specific optimal controller up to a task-heterogeneity bias in both model-based and model-free learning scenarios. Moreover, in the model-based setting, we show that such a controller is achieved with a linear convergence rate, which improves upon sub-linear rates from existing work. Our theoretical guarantees demonstrate that the learned controller can efficiently adapt to unseen LQR tasks.}
}
Endnote
%0 Conference Paper
%T Meta-learning linear quadratic regulators: a policy gradient MAML approach for model-free LQR
%A Leonardo Felipe Toso
%A Donglin Zhan
%A James Anderson
%A Han Wang
%B Proceedings of the 6th Annual Learning for Dynamics & Control Conference
%C Proceedings of Machine Learning Research
%D 2024
%E Alessandro Abate
%E Mark Cannon
%E Kostas Margellos
%E Antonis Papachristodoulou
%F pmlr-v242-toso24a
%I PMLR
%P 902--915
%U https://proceedings.mlr.press/v242/toso24a.html
%V 242
%X We investigate the problem of learning linear quadratic regulators (LQR) in a multi-task, heterogeneous, and model-free setting. We characterize the stability and personalization guarantees of a policy gradient-based (PG) model-agnostic meta-learning (MAML) (Finn et al., 2017) approach for the LQR problem under different task-heterogeneity settings. We show that our MAML-LQR algorithm produces a stabilizing controller close to each task-specific optimal controller up to a task-heterogeneity bias in both model-based and model-free learning scenarios. Moreover, in the model-based setting, we show that such a controller is achieved with a linear convergence rate, which improves upon sub-linear rates from existing work. Our theoretical guarantees demonstrate that the learned controller can efficiently adapt to unseen LQR tasks.
APA
Toso, L.F., Zhan, D., Anderson, J. & Wang, H. (2024). Meta-learning linear quadratic regulators: a policy gradient MAML approach for model-free LQR. Proceedings of the 6th Annual Learning for Dynamics & Control Conference, in Proceedings of Machine Learning Research 242:902-915. Available from https://proceedings.mlr.press/v242/toso24a.html.