Keypoints into the Future: Self-Supervised Correspondence in Model-Based Reinforcement Learning

Lucas Manuelli, Yunzhu Li, Pete Florence, Russ Tedrake
Proceedings of the 2020 Conference on Robot Learning, PMLR 155:693-710, 2021.

Abstract

Predictive models have been at the core of many robotic systems, from quadrotors to walking robots. However, it has been challenging to develop and apply such models to practical robotic manipulation due to high-dimensional sensory observations such as images. Previous approaches to learning models in the context of robotic manipulation have either learned whole-image dynamics or used autoencoders to learn dynamics in a low-dimensional latent state. In this work, we introduce model-based prediction with self-supervised visual correspondence learning, and show not only that this is possible but that these predictive models deliver compelling performance improvements over alternative methods for vision-based RL with autoencoder-type vision training. Through simulation experiments, we demonstrate that our models provide better generalization precision, particularly in 3D scenes, scenes involving occlusion, and in category generalization. Additionally, we validate that our method effectively transfers to the real world through hardware experiments. https://sites.google.com/view/keypointsintothefuture

Cite this Paper
BibTeX
@InProceedings{pmlr-v155-manuelli21a,
  title     = {Keypoints into the Future: Self-Supervised Correspondence in Model-Based Reinforcement Learning},
  author    = {Manuelli, Lucas and Li, Yunzhu and Florence, Pete and Tedrake, Russ},
  booktitle = {Proceedings of the 2020 Conference on Robot Learning},
  pages     = {693--710},
  year      = {2021},
  editor    = {Kober, Jens and Ramos, Fabio and Tomlin, Claire},
  volume    = {155},
  series    = {Proceedings of Machine Learning Research},
  month     = {16--18 Nov},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v155/manuelli21a/manuelli21a.pdf},
  url       = {https://proceedings.mlr.press/v155/manuelli21a.html},
  abstract  = {Predictive models have been at the core of many robotic systems, from quadrotors to walking robots. However, it has been challenging to develop and apply such models to practical robotic manipulation due to high-dimensional sensory observations such as images. Previous approaches to learning models in the context of robotic manipulation have either learned whole image dynamics or used autoencoders to learn dynamics in a low-dimensional latent state. In this work, we introduce model-based prediction with self-supervised visual correspondence learning, and show that not only is this indeed possible, but demonstrate that these types of predictive models show compelling performance improvements over alternative methods for vision-based RL with autoencoder-type vision training. Through simulation experiments, we demonstrate that our models provide better generalization precision, particularly in 3D scenes, scenes involving occlusion, and in category-generalization. Additionally, we validate that our method effectively transfers to the real world through hardware experiments. https://sites.google.com/view/keypointsintothefuture}
}
Endnote
%0 Conference Paper
%T Keypoints into the Future: Self-Supervised Correspondence in Model-Based Reinforcement Learning
%A Lucas Manuelli
%A Yunzhu Li
%A Pete Florence
%A Russ Tedrake
%B Proceedings of the 2020 Conference on Robot Learning
%C Proceedings of Machine Learning Research
%D 2021
%E Jens Kober
%E Fabio Ramos
%E Claire Tomlin
%F pmlr-v155-manuelli21a
%I PMLR
%P 693--710
%U https://proceedings.mlr.press/v155/manuelli21a.html
%V 155
%X Predictive models have been at the core of many robotic systems, from quadrotors to walking robots. However, it has been challenging to develop and apply such models to practical robotic manipulation due to high-dimensional sensory observations such as images. Previous approaches to learning models in the context of robotic manipulation have either learned whole image dynamics or used autoencoders to learn dynamics in a low-dimensional latent state. In this work, we introduce model-based prediction with self-supervised visual correspondence learning, and show that not only is this indeed possible, but demonstrate that these types of predictive models show compelling performance improvements over alternative methods for vision-based RL with autoencoder-type vision training. Through simulation experiments, we demonstrate that our models provide better generalization precision, particularly in 3D scenes, scenes involving occlusion, and in category-generalization. Additionally, we validate that our method effectively transfers to the real world through hardware experiments. https://sites.google.com/view/keypointsintothefuture
APA
Manuelli, L., Li, Y., Florence, P. & Tedrake, R. (2021). Keypoints into the Future: Self-Supervised Correspondence in Model-Based Reinforcement Learning. Proceedings of the 2020 Conference on Robot Learning, in Proceedings of Machine Learning Research 155:693-710. Available from https://proceedings.mlr.press/v155/manuelli21a.html.

Related Material