Model-Based Planning with Energy-Based Models

Yilun Du, Toru Lin, Igor Mordatch
Proceedings of the Conference on Robot Learning, PMLR 100:374-383, 2020.

Abstract

Model-based planning holds great promise for improving both sample efficiency and generalization in reinforcement learning (RL). We show that energy-based models (EBMs) are a promising class of models to use for model-based planning. EBMs naturally support inference of intermediate states given start and goal state distributions. We provide an online algorithm to train EBMs while interacting with the environment, and show that EBMs allow for significantly better online learning than corresponding feed-forward networks. We further show that EBMs support maximum entropy state inference and are able to generate diverse state space plans. We show that inference purely in state space - without planning actions - allows for better generalization to previously unseen obstacles in the environment and prevents the planner from exploiting the dynamics model by applying uncharacteristic action sequences. Finally, we show that online EBM training naturally leads to intentionally planned state exploration which performs significantly better than random exploration.
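The central mechanism the abstract describes, inferring the intermediate states of a plan given start and goal states by sampling low-energy trajectories, can be illustrated with a toy sketch. This is not the paper's actual model: the quadratic "smoothness" energy below is a hypothetical stand-in for a learned neural EBM, and the noisy gradient update is a simplified Langevin-style sampler.

```python
import numpy as np

def energy(traj):
    # Toy energy: low when consecutive states are close together.
    # A stand-in for a learned state-transition compatibility score.
    diffs = traj[1:] - traj[:-1]
    return 0.5 * np.sum(diffs ** 2)

def energy_grad(traj):
    # Analytic gradient of the quadratic energy w.r.t. each state.
    grad = np.zeros_like(traj)
    diffs = traj[1:] - traj[:-1]
    grad[1:] += diffs
    grad[:-1] -= diffs
    return grad

def infer_plan(start, goal, horizon, steps=200, lr=0.1, noise=0.01, seed=0):
    # Langevin-style inference in state space: initialize the trajectory
    # randomly, then follow the noisy negative energy gradient while
    # clamping the endpoints to the given start and goal states.
    rng = np.random.default_rng(seed)
    traj = rng.normal(size=(horizon, start.shape[0]))
    traj[0], traj[-1] = start, goal
    for _ in range(steps):
        traj -= lr * energy_grad(traj) + noise * rng.normal(size=traj.shape)
        traj[0], traj[-1] = start, goal  # re-clamp endpoints each step
    return traj

plan = infer_plan(np.array([0.0, 0.0]), np.array([1.0, 1.0]), horizon=6)
```

With this smoothness energy the sampled plan settles near the straight-line interpolation between start and goal; with a learned EBM, the same inference procedure yields plans that respect the environment's dynamics. Note that no actions are planned, only states, which is the property the abstract credits for better generalization to unseen obstacles.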

Cite this Paper


BibTeX
@InProceedings{pmlr-v100-du20a,
  title     = {Model-Based Planning with Energy-Based Models},
  author    = {Du, Yilun and Lin, Toru and Mordatch, Igor},
  booktitle = {Proceedings of the Conference on Robot Learning},
  pages     = {374--383},
  year      = {2020},
  editor    = {Kaelbling, Leslie Pack and Kragic, Danica and Sugiura, Komei},
  volume    = {100},
  series    = {Proceedings of Machine Learning Research},
  month     = {30 Oct--01 Nov},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v100/du20a/du20a.pdf},
  url       = {https://proceedings.mlr.press/v100/du20a.html}
}
Endnote
%0 Conference Paper
%T Model-Based Planning with Energy-Based Models
%A Yilun Du
%A Toru Lin
%A Igor Mordatch
%B Proceedings of the Conference on Robot Learning
%C Proceedings of Machine Learning Research
%D 2020
%E Leslie Pack Kaelbling
%E Danica Kragic
%E Komei Sugiura
%F pmlr-v100-du20a
%I PMLR
%P 374--383
%U https://proceedings.mlr.press/v100/du20a.html
%V 100
APA
Du, Y., Lin, T. & Mordatch, I. (2020). Model-Based Planning with Energy-Based Models. Proceedings of the Conference on Robot Learning, in Proceedings of Machine Learning Research 100:374-383. Available from https://proceedings.mlr.press/v100/du20a.html.