Multi-Task Imitation Learning for Linear Dynamical Systems

Thomas T. Zhang; Katie Kang; Bruce D Lee; Claire Tomlin; Sergey Levine; Stephen Tu; Nikolai Matni

Proceedings of The 5th Annual Learning for Dynamics and Control Conference, PMLR 211:586-599, 2023.

Abstract

We study representation learning for efficient imitation learning over linear systems. In particular, we consider a setting where learning is split into two phases: (a) a pre-training step where a shared $k$-dimensional representation is learned from $H$ source policies, and (b) a target policy fine-tuning step where the learned representation is used to parameterize the policy class. We find that the imitation gap over trajectories generated by the learned target policy is bounded by $\tilde{O}\left( \frac{k n_x}{HN_{\mathrm{shared}}} + \frac{k n_u}{N_{\mathrm{target}}}\right)$, where $n_x > k$ is the state dimension, $n_u$ is the input dimension, $N_{\mathrm{shared}}$ denotes the total amount of data collected for each policy during representation learning, and $N_{\mathrm{target}}$ is the amount of target task data. This result formalizes the intuition that aggregating data across related tasks to learn a representation can significantly improve the sample efficiency of learning a target task. The trends suggested by this bound are corroborated in simulation.
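
To make the scaling in the bound above concrete, the following minimal Python sketch evaluates its two terms separately. It drops the constants and logarithmic factors hidden by $\tilde{O}$, so the outputs are only relative rates, not actual imitation gaps. The function name `bound_terms` and all numeric values are hypothetical and chosen purely for illustration; they do not come from the paper.

```python
def bound_terms(k, n_x, n_u, H, n_shared, n_target):
    """Return the two terms inside the O-tilde bound.

    Constants and log factors are ignored (an assumption of this sketch),
    so only the relative scaling is meaningful.
    """
    if not n_x > k:
        raise ValueError("the bound is stated for n_x > k")
    representation_term = k * n_x / (H * n_shared)  # pre-training phase
    target_term = k * n_u / n_target                # fine-tuning phase
    return representation_term, target_term


# Hypothetical problem sizes, for illustration only.
rep, tgt = bound_terms(k=2, n_x=10, n_u=4, H=5, n_shared=1000, n_target=100)
print(f"representation term: {rep:.4f}")
print(f"target term:         {tgt:.4f}")

# Doubling the number of source policies halves the representation term
# and leaves the target term unchanged.
rep_2h, tgt_2h = bound_terms(k=2, n_x=10, n_u=4, H=10, n_shared=1000, n_target=100)
print(f"with 2H source policies: {rep_2h:.4f}, {tgt_2h:.4f}")
```

Under these illustrative values the representation term shrinks with both $H$ and $N_{\mathrm{shared}}$, while the target term depends only on $k$, $n_u$, and $N_{\mathrm{target}}$.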

Cite this Paper

BibTeX


@InProceedings{pmlr-v211-zhang23b,
  title = 	 {Multi-Task Imitation Learning for Linear Dynamical Systems},
  author =       {Zhang, Thomas T. and Kang, Katie and Lee, Bruce D and Tomlin, Claire and Levine, Sergey and Tu, Stephen and Matni, Nikolai},
  booktitle = 	 {Proceedings of The 5th Annual Learning for Dynamics and Control Conference},
  pages = 	 {586--599},
  year = 	 {2023},
  editor = 	 {Matni, Nikolai and Morari, Manfred and Pappas, George J.},
  volume = 	 {211},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {15--16 Jun},
  publisher =    {PMLR},
  pdf = 	 {https://proceedings.mlr.press/v211/zhang23b/zhang23b.pdf},
  url = 	 {https://proceedings.mlr.press/v211/zhang23b.html},
  abstract = 	 {We study representation learning for efficient imitation learning over linear systems. In particular, we consider a setting where  learning is split into two phases: (a) a pre-training step where a shared $k$-dimensional representation is learned from $H$ source policies, and (b) a target policy fine-tuning step where the learned representation is used to parameterize the policy class. We find that the imitation gap over trajectories generated by the learned target policy is bounded by $\tilde{O}\left( \frac{k n_x}{HN_{\mathrm{shared}}} + \frac{k n_u}{N_{\mathrm{target}}}\right)$, where $n_x > k$ is the state dimension, $n_u$ is the input dimension, $N_{\mathrm{shared}}$ denotes the total amount of data collected for each policy during representation learning, and $N_{\mathrm{target}}$ is the amount of target task data. This result formalizes the intuition that aggregating data across related tasks to learn a representation can significantly improve the sample efficiency of learning a target task. The trends suggested by this bound are corroborated in simulation. }
}

Endnote

%0 Conference Paper
%T Multi-Task Imitation Learning for Linear Dynamical Systems
%A Thomas T. Zhang
%A Katie Kang
%A Bruce D Lee
%A Claire Tomlin
%A Sergey Levine
%A Stephen Tu
%A Nikolai Matni
%B Proceedings of The 5th Annual Learning for Dynamics and Control Conference
%C Proceedings of Machine Learning Research
%D 2023
%E Nikolai Matni
%E Manfred Morari
%E George J. Pappas	
%F pmlr-v211-zhang23b
%I PMLR
%P 586--599
%U https://proceedings.mlr.press/v211/zhang23b.html
%V 211
%X We study representation learning for efficient imitation learning over linear systems. In particular, we consider a setting where  learning is split into two phases: (a) a pre-training step where a shared $k$-dimensional representation is learned from $H$ source policies, and (b) a target policy fine-tuning step where the learned representation is used to parameterize the policy class. We find that the imitation gap over trajectories generated by the learned target policy is bounded by $\tilde{O}\left( \frac{k n_x}{HN_{\mathrm{shared}}} + \frac{k n_u}{N_{\mathrm{target}}}\right)$, where $n_x > k$ is the state dimension, $n_u$ is the input dimension, $N_{\mathrm{shared}}$ denotes the total amount of data collected for each policy during representation learning, and $N_{\mathrm{target}}$ is the amount of target task data. This result formalizes the intuition that aggregating data across related tasks to learn a representation can significantly improve the sample efficiency of learning a target task. The trends suggested by this bound are corroborated in simulation.

APA


Zhang, T.T., Kang, K., Lee, B.D., Tomlin, C., Levine, S., Tu, S. & Matni, N. (2023). Multi-Task Imitation Learning for Linear Dynamical Systems. Proceedings of The 5th Annual Learning for Dynamics and Control Conference, in Proceedings of Machine Learning Research 211:586-599. Available from https://proceedings.mlr.press/v211/zhang23b.html.
