Koopman Q-learning: Offline Reinforcement Learning via Symmetries of Dynamics

Matthias Weissenbacher, Samarth Sinha, Animesh Garg, Kawahara Yoshinobu
Proceedings of the 39th International Conference on Machine Learning, PMLR 162:23645-23667, 2022.

Abstract

Offline reinforcement learning leverages large datasets to train policies without interacting with the environment. The learned policies may then be deployed in real-world settings where interactions are costly or dangerous. Current algorithms overfit to the training dataset and, as a consequence, perform poorly when deployed to out-of-distribution generalizations of the environment. We aim to address these limitations by learning a Koopman latent representation which allows us to infer symmetries of the system’s underlying dynamics. These symmetries are then used to extend the otherwise static offline dataset during training; this constitutes a novel data augmentation framework which reflects the system’s dynamics and can thus be interpreted as an exploration of the environment’s phase space. To obtain the symmetries we employ Koopman theory, in which nonlinear dynamics are represented in terms of a linear operator acting on the space of measurement functions of the system. We provide novel theoretical results on the existence and nature of symmetries relevant for control systems such as reinforcement learning settings. Moreover, we empirically evaluate our method on several benchmark offline reinforcement learning tasks and datasets, including D4RL, Metaworld and Robosuite, and find that our framework consistently improves the state of the art of model-free Q-learning methods.
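For context, the Koopman formalism the abstract refers to can be sketched in two lines; the notation below (state map F, measurement function g, Koopman operator K, symmetry operator A) is illustrative of the general theory and not taken from the paper itself. For discrete-time dynamics x_{t+1} = F(x_t), the Koopman operator acts linearly on measurement functions g by composition with F:

\[
  (\mathcal{K} g)(x_t) \;=\; g\big(F(x_t)\big) \;=\; g(x_{t+1}).
\]

A linear operator A on the space of measurements is a symmetry of the dynamics when it commutes with the Koopman operator, A\mathcal{K} = \mathcal{K}A; the transformed measurements Ag then evolve under the same linear dynamics,

\[
  \mathcal{K}\,(A g)(x_t) \;=\; A\,(\mathcal{K} g)(x_t) \;=\; (A g)(x_{t+1}),
\]

which is what licenses treating symmetry-transformed samples as additional, dynamically consistent training data in the augmentation scheme the abstract describes.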

Cite this Paper


BibTeX
@InProceedings{pmlr-v162-weissenbacher22a,
  title     = {Koopman Q-learning: Offline Reinforcement Learning via Symmetries of Dynamics},
  author    = {Weissenbacher, Matthias and Sinha, Samarth and Garg, Animesh and Yoshinobu, Kawahara},
  booktitle = {Proceedings of the 39th International Conference on Machine Learning},
  pages     = {23645--23667},
  year      = {2022},
  editor    = {Chaudhuri, Kamalika and Jegelka, Stefanie and Song, Le and Szepesvari, Csaba and Niu, Gang and Sabato, Sivan},
  volume    = {162},
  series    = {Proceedings of Machine Learning Research},
  month     = {17--23 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v162/weissenbacher22a/weissenbacher22a.pdf},
  url       = {https://proceedings.mlr.press/v162/weissenbacher22a.html}
}
Endnote
%0 Conference Paper
%T Koopman Q-learning: Offline Reinforcement Learning via Symmetries of Dynamics
%A Matthias Weissenbacher
%A Samarth Sinha
%A Animesh Garg
%A Kawahara Yoshinobu
%B Proceedings of the 39th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2022
%E Kamalika Chaudhuri
%E Stefanie Jegelka
%E Le Song
%E Csaba Szepesvari
%E Gang Niu
%E Sivan Sabato
%F pmlr-v162-weissenbacher22a
%I PMLR
%P 23645--23667
%U https://proceedings.mlr.press/v162/weissenbacher22a.html
%V 162
APA
Weissenbacher, M., Sinha, S., Garg, A. & Yoshinobu, K. (2022). Koopman Q-learning: Offline Reinforcement Learning via Symmetries of Dynamics. Proceedings of the 39th International Conference on Machine Learning, in Proceedings of Machine Learning Research 162:23645-23667. Available from https://proceedings.mlr.press/v162/weissenbacher22a.html.
