Bisimulation Makes Analogies in Goal-Conditioned Reinforcement Learning

Philippe Hansen-Estruch, Amy Zhang, Ashvin Nair, Patrick Yin, Sergey Levine
Proceedings of the 39th International Conference on Machine Learning, PMLR 162:8407-8426, 2022.

Abstract

Building generalizable goal-conditioned agents from rich observations is key to applying reinforcement learning (RL) to real-world problems. Traditionally in goal-conditioned RL, an agent is provided with the exact goal it intends to reach. However, it is often not realistic to know the configuration of the goal before performing a task. A more scalable framework would allow us to provide the agent with an example of an analogous task, and have the agent then infer what the goal should be for its current state. We propose a new form of state abstraction called goal-conditioned bisimulation that captures functional equivariance, allowing for the reuse of skills to achieve new goals. We learn this representation using a metric form of this abstraction, and show its ability to generalize to new goals in real-world manipulation tasks. Further, we prove that this learned representation is sufficient not only for goal-conditioned tasks, but is amenable to any downstream task described by a state-only reward function.
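
To make the abstract's "metric form of this abstraction" concrete, below is a minimal sketch of how a bisimulation-style representation loss can be trained on state-goal pairs, in the spirit of deep bisimulation metrics (e.g., DBC, Zhang et al., 2021). The encoder, latent dynamics model, and all function names here are illustrative assumptions, not the paper's implementation; the exact goal-conditioned objective in the paper may differ.

    import torch
    import torch.nn.functional as F

    def goal_conditioned_bisim_loss(encoder, dynamics, obs, goals, actions, rewards, gamma=0.99):
        # Embed state-goal pairs; pair each sample with a randomly permuted partner.
        z = encoder(obs, goals)
        perm = torch.randperm(z.shape[0])
        z_i, z_j = z, z[perm]

        # A learned latent dynamics model predicting a diagonal Gaussian over next latents.
        mu_i, sigma_i = dynamics(z_i, actions)
        mu_j, sigma_j = dynamics(z_j, actions[perm])

        # Bisimulation-style target: goal-conditioned reward difference plus the
        # discounted closed-form 2-Wasserstein distance between the two Gaussians.
        r_diff = (rewards - rewards[perm]).abs()
        w2 = torch.sqrt(((mu_i - mu_j) ** 2).sum(-1) + ((sigma_i - sigma_j) ** 2).sum(-1))
        target = (r_diff + gamma * w2).detach()

        # Train the encoder so that L1 distances in latent space match the target metric.
        latent_dist = (z_i - z_j).abs().sum(-1)
        return F.mse_loss(latent_dist, target)

In a sketch of this kind, skills can transfer across goals because two state-goal pairs that are "analogous" (similar reward structure and matching transition distributions under the metric) are pushed toward the same point in latent space.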

Cite this Paper


BibTeX
@InProceedings{pmlr-v162-hansen-estruch22a,
  title     = {Bisimulation Makes Analogies in Goal-Conditioned Reinforcement Learning},
  author    = {Hansen-Estruch, Philippe and Zhang, Amy and Nair, Ashvin and Yin, Patrick and Levine, Sergey},
  booktitle = {Proceedings of the 39th International Conference on Machine Learning},
  pages     = {8407--8426},
  year      = {2022},
  editor    = {Chaudhuri, Kamalika and Jegelka, Stefanie and Song, Le and Szepesvari, Csaba and Niu, Gang and Sabato, Sivan},
  volume    = {162},
  series    = {Proceedings of Machine Learning Research},
  month     = {17--23 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v162/hansen-estruch22a/hansen-estruch22a.pdf},
  url       = {https://proceedings.mlr.press/v162/hansen-estruch22a.html},
  abstract  = {Building generalizable goal-conditioned agents from rich observations is a key to reinforcement learning (RL) solving real world problems. Traditionally in goal-conditioned RL, an agent is provided with the exact goal they intend to reach. However, it is often not realistic to know the configuration of the goal before performing a task. A more scalable framework would allow us to provide the agent with an example of an analogous task, and have the agent then infer what the goal should be for its current state. We propose a new form of state abstraction called goal-conditioned bisimulation that captures functional equivariance, allowing for the reuse of skills to achieve new goals. We learn this representation using a metric form of this abstraction, and show its ability to generalize to new goals in real world manipulation tasks. Further, we prove that this learned representation is sufficient not only for goal-conditioned tasks, but is amenable to any downstream task described by a state-only reward function.}
}
Endnote
%0 Conference Paper
%T Bisimulation Makes Analogies in Goal-Conditioned Reinforcement Learning
%A Philippe Hansen-Estruch
%A Amy Zhang
%A Ashvin Nair
%A Patrick Yin
%A Sergey Levine
%B Proceedings of the 39th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2022
%E Kamalika Chaudhuri
%E Stefanie Jegelka
%E Le Song
%E Csaba Szepesvari
%E Gang Niu
%E Sivan Sabato
%F pmlr-v162-hansen-estruch22a
%I PMLR
%P 8407--8426
%U https://proceedings.mlr.press/v162/hansen-estruch22a.html
%V 162
%X Building generalizable goal-conditioned agents from rich observations is a key to reinforcement learning (RL) solving real world problems. Traditionally in goal-conditioned RL, an agent is provided with the exact goal they intend to reach. However, it is often not realistic to know the configuration of the goal before performing a task. A more scalable framework would allow us to provide the agent with an example of an analogous task, and have the agent then infer what the goal should be for its current state. We propose a new form of state abstraction called goal-conditioned bisimulation that captures functional equivariance, allowing for the reuse of skills to achieve new goals. We learn this representation using a metric form of this abstraction, and show its ability to generalize to new goals in real world manipulation tasks. Further, we prove that this learned representation is sufficient not only for goal-conditioned tasks, but is amenable to any downstream task described by a state-only reward function.
APA
Hansen-Estruch, P., Zhang, A., Nair, A., Yin, P. & Levine, S. (2022). Bisimulation Makes Analogies in Goal-Conditioned Reinforcement Learning. Proceedings of the 39th International Conference on Machine Learning, in Proceedings of Machine Learning Research 162:8407-8426. Available from https://proceedings.mlr.press/v162/hansen-estruch22a.html.
