Model Predictive Control via On-Policy Imitation Learning

Kwangjun Ahn, Zakaria Mhammedi, Horia Mania, Zhang-Wei Hong, Ali Jadbabaie
Proceedings of The 5th Annual Learning for Dynamics and Control Conference, PMLR 211:1493-1505, 2023.

Abstract

In this paper, we leverage the rapid advances in imitation learning, a topic of intense recent focus in the Reinforcement Learning (RL) literature, to develop new sample complexity results and performance guarantees for data-driven Model Predictive Control (MPC) for constrained linear systems. In its simplest form, imitation learning is an approach that tries to learn an expert policy by querying samples from an expert. Recent approaches to data-driven MPC have used the simplest form of imitation learning known as behavior cloning to learn controllers that mimic the performance of MPC by online sampling of the trajectories of the closed-loop MPC system. Behavior cloning, however, is a method that is known to be data inefficient and suffer from distribution shifts. As an alternative, we develop a variant of the forward training algorithm which is an on-policy imitation learning method proposed by (Ross et al. 2010). Our algorithm uses the structure of constrained linear MPC, and our analysis uses the properties of the explicit MPC solution to theoretically bound the number of online MPC trajectories needed to achieve optimal performance. We validate our results through simulations and show that the forward training algorithm is indeed superior to behavior cloning when applied to MPC.

Cite this Paper


BibTeX
@InProceedings{pmlr-v211-ahn23a, title = {Model Predictive Control via On-Policy Imitation Learning}, author = {Ahn, Kwangjun and Mhammedi, Zakaria and Mania, Horia and Hong, Zhang-Wei and Jadbabaie, Ali}, booktitle = {Proceedings of The 5th Annual Learning for Dynamics and Control Conference}, pages = {1493--1505}, year = {2023}, editor = {Matni, Nikolai and Morari, Manfred and Pappas, George J.}, volume = {211}, series = {Proceedings of Machine Learning Research}, month = {15--16 Jun}, publisher = {PMLR}, pdf = {https://proceedings.mlr.press/v211/ahn23a/ahn23a.pdf}, url = {https://proceedings.mlr.press/v211/ahn23a.html}, abstract = {In this paper, we leverage the rapid advances in imitation learning, a topic of intense recent focus in the Reinforcement Learning (RL) literature, to develop new sample complexity results and performance guarantees for data-driven Model Predictive Control (MPC) for constrained linear systems. In its simplest form, imitation learning is an approach that tries to learn an expert policy by querying samples from an expert. Recent approaches to data-driven MPC have used the simplest form of imitation learning known as behavior cloning to learn controllers that mimic the performance of MPC by online sampling of the trajectories of the closed-loop MPC system. Behavior cloning, however, is a method that is known to be data inefficient and suffer from distribution shifts. As an alternative, we develop a variant of the forward training algorithm which is an on-policy imitation learning method proposed by (Ross et al. 2010). Our algorithm uses the structure of constrained linear MPC, and our analysis uses the properties of the explicit MPC solution to theoretically bound the number of online MPC trajectories needed to achieve optimal performance. We validate our results through simulations and show that the forward training algorithm is indeed superior to behavior cloning when applied to MPC.} }
Endnote
%0 Conference Paper %T Model Predictive Control via On-Policy Imitation Learning %A Kwangjun Ahn %A Zakaria Mhammedi %A Horia Mania %A Zhang-Wei Hong %A Ali Jadbabaie %B Proceedings of The 5th Annual Learning for Dynamics and Control Conference %C Proceedings of Machine Learning Research %D 2023 %E Nikolai Matni %E Manfred Morari %E George J. Pappas %F pmlr-v211-ahn23a %I PMLR %P 1493--1505 %U https://proceedings.mlr.press/v211/ahn23a.html %V 211 %X In this paper, we leverage the rapid advances in imitation learning, a topic of intense recent focus in the Reinforcement Learning (RL) literature, to develop new sample complexity results and performance guarantees for data-driven Model Predictive Control (MPC) for constrained linear systems. In its simplest form, imitation learning is an approach that tries to learn an expert policy by querying samples from an expert. Recent approaches to data-driven MPC have used the simplest form of imitation learning known as behavior cloning to learn controllers that mimic the performance of MPC by online sampling of the trajectories of the closed-loop MPC system. Behavior cloning, however, is a method that is known to be data inefficient and suffer from distribution shifts. As an alternative, we develop a variant of the forward training algorithm which is an on-policy imitation learning method proposed by (Ross et al. 2010). Our algorithm uses the structure of constrained linear MPC, and our analysis uses the properties of the explicit MPC solution to theoretically bound the number of online MPC trajectories needed to achieve optimal performance. We validate our results through simulations and show that the forward training algorithm is indeed superior to behavior cloning when applied to MPC.
APA
Ahn, K., Mhammedi, Z., Mania, H., Hong, Z. & Jadbabaie, A.. (2023). Model Predictive Control via On-Policy Imitation Learning. Proceedings of The 5th Annual Learning for Dynamics and Control Conference, in Proceedings of Machine Learning Research 211:1493-1505 Available from https://proceedings.mlr.press/v211/ahn23a.html.

Related Material