Domain Adaptation Using System Invariant Dynamics Models

Sean J. Wang, Aaron M. Johnson
Proceedings of the 3rd Conference on Learning for Dynamics and Control, PMLR 144:1130-1141, 2021.

Abstract

Reinforcement learning requires large amounts of training data. For many systems, especially mobile robots, collecting this training data can be expensive and time-consuming. We propose a novel domain adaptation method to reduce the amount of training data needed for model-based reinforcement learning methods to train policies for a target system. Using our method, the required amount of target system training data can be reduced by collecting data on a proxy system with similar, but not identical, dynamics on which training data is cheaper to collect. Our method models the underlying dynamics shared between the two systems using a System Invariant Dynamics Model (SIDM), and models each system's relationship to the SIDM using encoders and decoders. When only limited amounts of target system training data are available, using target and proxy data to train the SIDM, encoders, and decoders can lead to more accurate dynamics models for the target system than using target system data alone. We demonstrate this approach using simulated wheeled robots driving over rough terrain, varying dynamics parameters between the target and proxy system, and find a reduction of 5-20x in the amount of data needed for these systems.
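The architecture described above can be illustrated with a minimal sketch: each system gets its own encoder (into a shared latent space) and decoder (back to that system's state), while a single dynamics model is shared across systems. This is a hypothetical illustration using random affine maps in place of the trained neural networks the paper uses; all names, dimensions, and the `SIDM` class itself are assumptions for exposition, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def linear(in_dim, out_dim):
    # Random affine map standing in for a trained network layer (illustration only).
    return rng.normal(size=(in_dim, out_dim)) * 0.1

class SIDM:
    """Sketch of the shared-dynamics structure: per-system encoders/decoders
    around a single System Invariant Dynamics Model."""

    def __init__(self, state_dim, action_dim, latent_dim):
        self.state_dim = state_dim
        self.latent_dim = latent_dim
        # Shared dynamics core: (latent state, action) -> next latent state.
        self.shared = linear(latent_dim + action_dim, latent_dim)
        self.enc = {}  # per-system encoder: system state -> shared latent
        self.dec = {}  # per-system decoder: shared latent -> system state

    def add_system(self, name):
        self.enc[name] = linear(self.state_dim, self.latent_dim)
        self.dec[name] = linear(self.latent_dim, self.state_dim)

    def predict(self, name, state, action):
        z = np.tanh(state @ self.enc[name])                           # encode
        z_next = np.tanh(np.concatenate([z, action]) @ self.shared)   # shared step
        return z_next @ self.dec[name]                                # decode

# Both a cheap proxy system and the expensive target system map into the
# same latent dynamics, which is where proxy data helps the target model.
model = SIDM(state_dim=6, action_dim=2, latent_dim=8)
model.add_system("proxy")
model.add_system("target")
pred = model.predict("target", np.zeros(6), np.ones(2))
print(pred.shape)  # (6,)
```

In training, the shared core would be fit on data from both systems while each encoder/decoder pair is fit only on its own system's data, which is how limited target data can be supplemented by cheaper proxy data.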

Cite this Paper


BibTeX
@InProceedings{pmlr-v144-wang21c,
  title     = {Domain Adaptation Using System Invariant Dynamics Models},
  author    = {Wang, Sean J. and Johnson, Aaron M.},
  booktitle = {Proceedings of the 3rd Conference on Learning for Dynamics and Control},
  pages     = {1130--1141},
  year      = {2021},
  editor    = {Jadbabaie, Ali and Lygeros, John and Pappas, George J. and Parrilo, Pablo A. and Recht, Benjamin and Tomlin, Claire J. and Zeilinger, Melanie N.},
  volume    = {144},
  series    = {Proceedings of Machine Learning Research},
  month     = {07--08 June},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v144/wang21c/wang21c.pdf},
  url       = {https://proceedings.mlr.press/v144/wang21c.html},
  abstract  = {Reinforcement learning requires large amounts of training data. For many systems, especially mobile robots, collecting this training data can be expensive and time-consuming. We propose a novel domain adaptation method to reduce the amount of training data needed for model-based reinforcement learning methods to train policies for a target system. Using our method, the required amount of target system training data can be reduced by collecting data on a proxy system with similar, but not identical, dynamics on which training data is cheaper to collect. Our method models the underlying dynamics shared between the two systems using a System Invariant Dynamics Model (SIDM), and models each system's relationship to the SIDM using encoders and decoders. When only limited amounts of target system training data are available, using target and proxy data to train the SIDM, encoders, and decoders can lead to more accurate dynamics models for the target system than using target system data alone. We demonstrate this approach using simulated wheeled robots driving over rough terrain, varying dynamics parameters between the target and proxy system, and find a reduction of 5-20x in the amount of data needed for these systems.}
}
Endnote
%0 Conference Paper
%T Domain Adaptation Using System Invariant Dynamics Models
%A Sean J. Wang
%A Aaron M. Johnson
%B Proceedings of the 3rd Conference on Learning for Dynamics and Control
%C Proceedings of Machine Learning Research
%D 2021
%E Ali Jadbabaie
%E John Lygeros
%E George J. Pappas
%E Pablo A. Parrilo
%E Benjamin Recht
%E Claire J. Tomlin
%E Melanie N. Zeilinger
%F pmlr-v144-wang21c
%I PMLR
%P 1130--1141
%U https://proceedings.mlr.press/v144/wang21c.html
%V 144
%X Reinforcement learning requires large amounts of training data. For many systems, especially mobile robots, collecting this training data can be expensive and time-consuming. We propose a novel domain adaptation method to reduce the amount of training data needed for model-based reinforcement learning methods to train policies for a target system. Using our method, the required amount of target system training data can be reduced by collecting data on a proxy system with similar, but not identical, dynamics on which training data is cheaper to collect. Our method models the underlying dynamics shared between the two systems using a System Invariant Dynamics Model (SIDM), and models each system's relationship to the SIDM using encoders and decoders. When only limited amounts of target system training data are available, using target and proxy data to train the SIDM, encoders, and decoders can lead to more accurate dynamics models for the target system than using target system data alone. We demonstrate this approach using simulated wheeled robots driving over rough terrain, varying dynamics parameters between the target and proxy system, and find a reduction of 5-20x in the amount of data needed for these systems.
APA
Wang, S.J. & Johnson, A.M. (2021). Domain Adaptation Using System Invariant Dynamics Models. Proceedings of the 3rd Conference on Learning for Dynamics and Control, in Proceedings of Machine Learning Research 144:1130-1141. Available from https://proceedings.mlr.press/v144/wang21c.html.