Learning Temporally Abstract World Models without Online Experimentation

Benjamin Freed, Siddarth Venkatraman, Guillaume Adrien Sartoretti, Jeff Schneider, Howie Choset
Proceedings of the 40th International Conference on Machine Learning, PMLR 202:10338-10356, 2023.

Abstract

Agents that can build temporally abstract representations of their environment are better able to understand their world and make plans on extended time scales, with limited computational power and modeling capacity. However, existing methods for automatically learning temporally abstract world models usually require millions of online environmental interactions and incentivize agents to reach every accessible environmental state, which is infeasible for most real-world robots both in terms of data efficiency and hardware safety. In this paper, we present an approach for simultaneously learning sets of skills and temporally abstract, skill-conditioned world models purely from offline data, enabling agents to perform zero-shot online planning of skill sequences for new tasks. We show that our approach performs comparably to or better than a wide array of state-of-the-art offline RL algorithms on a number of simulated robotics locomotion and manipulation benchmarks, while offering a higher degree of adaptability to new goals. Finally, we show that our approach offers a much higher degree of robustness to perturbations in environmental dynamics, compared to policy-based methods.
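To make the abstract's idea of planning over a skill-conditioned, temporally abstract world model concrete, below is a minimal illustrative sketch. It is not the paper's implementation: the state and skill dimensions, the toy affine "dynamics", and the random-shooting planner are all assumptions chosen so the example runs end to end; in the paper, both the skills and the model are learned from offline data.

# Illustrative sketch only: a toy skill-conditioned world model and a
# random-shooting planner over skill sequences. All names, shapes, and the
# fixed random "dynamics" below are assumptions for illustration, not the
# paper's actual learned architecture or training procedure.
import numpy as np

rng = np.random.default_rng(0)

STATE_DIM = 4       # assumed abstract state dimension
SKILL_DIM = 2       # assumed continuous skill-latent dimension
HORIZON = 5         # number of skills planned ahead
N_CANDIDATES = 256  # random-shooting sample size

# Stand-in for a learned temporally abstract model f(s, z) -> s':
# a fixed random affine map followed by tanh, so the example is runnable.
W_s = rng.normal(scale=0.5, size=(STATE_DIM, STATE_DIM))
W_z = rng.normal(scale=0.5, size=(STATE_DIM, SKILL_DIM))

def skill_model(state, skill):
    """Predict the abstract state reached after executing one skill."""
    return np.tanh(W_s @ state + W_z @ skill)

def rollout(state, skills):
    """Roll the model forward through a sequence of skill latents."""
    for z in skills:
        state = skill_model(state, z)
    return state

def plan_skills(state, goal):
    """Random shooting: sample skill sequences and keep the one whose
    predicted terminal state is closest to the goal."""
    candidates = rng.normal(size=(N_CANDIDATES, HORIZON, SKILL_DIM))
    costs = [np.linalg.norm(rollout(state, seq) - goal) for seq in candidates]
    return candidates[int(np.argmin(costs))]

if __name__ == "__main__":
    s0 = np.zeros(STATE_DIM)
    goal = np.full(STATE_DIM, 0.5)
    best = plan_skills(s0, goal)
    print("first skill latent to execute:", best[0])
    print("predicted terminal error:", np.linalg.norm(rollout(s0, best) - goal))

Because planning happens over a handful of skill latents rather than thousands of low-level actions, this kind of planner can be re-run zero-shot for a new goal without retraining, which is the adaptability property the abstract highlights.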

Cite this Paper


BibTeX
@InProceedings{pmlr-v202-freed23a,
  title     = {Learning Temporally {A}bstract {W}orld Models without Online Experimentation},
  author    = {Freed, Benjamin and Venkatraman, Siddarth and Sartoretti, Guillaume Adrien and Schneider, Jeff and Choset, Howie},
  booktitle = {Proceedings of the 40th International Conference on Machine Learning},
  pages     = {10338--10356},
  year      = {2023},
  editor    = {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan},
  volume    = {202},
  series    = {Proceedings of Machine Learning Research},
  month     = {23--29 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v202/freed23a/freed23a.pdf},
  url       = {https://proceedings.mlr.press/v202/freed23a.html},
  abstract  = {Agents that can build temporally abstract representations of their environment are better able to understand their world and make plans on extended time scales, with limited computational power and modeling capacity. However, existing methods for automatically learning temporally abstract world models usually require millions of online environmental interactions and incentivize agents to reach every accessible environmental state, which is infeasible for most real-world robots both in terms of data efficiency and hardware safety. In this paper, we present an approach for simultaneously learning sets of skills and temporally abstract, skill-conditioned world models purely from offline data, enabling agents to perform zero-shot online planning of skill sequences for new tasks. We show that our approach performs comparably to or better than a wide array of state-of-the-art offline RL algorithms on a number of simulated robotics locomotion and manipulation benchmarks, while offering a higher degree of adaptability to new goals. Finally, we show that our approach offers a much higher degree of robustness to perturbations in environmental dynamics, compared to policy-based methods.}
}
Endnote
%0 Conference Paper
%T Learning Temporally Abstract World Models without Online Experimentation
%A Benjamin Freed
%A Siddarth Venkatraman
%A Guillaume Adrien Sartoretti
%A Jeff Schneider
%A Howie Choset
%B Proceedings of the 40th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Andreas Krause
%E Emma Brunskill
%E Kyunghyun Cho
%E Barbara Engelhardt
%E Sivan Sabato
%E Jonathan Scarlett
%F pmlr-v202-freed23a
%I PMLR
%P 10338--10356
%U https://proceedings.mlr.press/v202/freed23a.html
%V 202
%X Agents that can build temporally abstract representations of their environment are better able to understand their world and make plans on extended time scales, with limited computational power and modeling capacity. However, existing methods for automatically learning temporally abstract world models usually require millions of online environmental interactions and incentivize agents to reach every accessible environmental state, which is infeasible for most real-world robots both in terms of data efficiency and hardware safety. In this paper, we present an approach for simultaneously learning sets of skills and temporally abstract, skill-conditioned world models purely from offline data, enabling agents to perform zero-shot online planning of skill sequences for new tasks. We show that our approach performs comparably to or better than a wide array of state-of-the-art offline RL algorithms on a number of simulated robotics locomotion and manipulation benchmarks, while offering a higher degree of adaptability to new goals. Finally, we show that our approach offers a much higher degree of robustness to perturbations in environmental dynamics, compared to policy-based methods.
APA
Freed, B., Venkatraman, S., Sartoretti, G.A., Schneider, J. & Choset, H. (2023). Learning Temporally Abstract World Models without Online Experimentation. Proceedings of the 40th International Conference on Machine Learning, in Proceedings of Machine Learning Research 202:10338-10356. Available from https://proceedings.mlr.press/v202/freed23a.html.
