Hallucinative Topological Memory for Zero-Shot Visual Planning

Kara Liu, Thanard Kurutach, Christine Tung, Pieter Abbeel, Aviv Tamar
Proceedings of the 37th International Conference on Machine Learning, PMLR 119:6259-6270, 2020.

Abstract

In visual planning (VP), an agent learns to plan goal-directed behavior from observations of a dynamical system obtained offline, e.g., images obtained from self-supervised robot interaction. Most previous works on VP approached the problem by planning in a learned latent space, resulting in low-quality visual plans and difficult training algorithms. Here, instead, we propose a simple VP method that plans directly in image space and displays competitive performance. We build on the semi-parametric topological memory (SPTM) method: image samples are treated as nodes in a graph, the graph connectivity is learned from image sequence data, and planning can be performed using conventional graph search methods. We propose two modifications to SPTM. First, we train an energy-based graph connectivity function using contrastive predictive coding that admits stable training. Second, to allow zero-shot planning in new domains, we learn a conditional VAE model that generates images given a context describing the domain, and use these hallucinated samples for building the connectivity graph and planning. We show that this simple approach significantly outperforms the SOTA VP methods, in terms of both plan interpretability and success rate when using the plan to guide a trajectory-following controller. Interestingly, our method can pick up non-trivial visual properties of objects, such as their geometry, and account for them in the plans.
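
To make the pipeline in the abstract concrete, below is a minimal Python sketch of the hallucinate-then-plan loop: sample images for a new domain, score pairwise connectivity, build a graph, and run a conventional shortest-path search. It is an illustration under stated assumptions, not the authors' implementation: sample_cvae stands in for the learned conditional VAE, connectivity_score for the CPC-trained energy-based connectivity function, and the threshold and edge-weighting choices are hypothetical.

import numpy as np
import networkx as nx


def sample_cvae(context, n_samples, rng):
    # Placeholder for the learned conditional VAE: in the paper it decodes
    # latent samples into images of the new domain described by `context`.
    # Here it just returns random arrays of the same shape, for illustration.
    return rng.random((n_samples,) + context.shape)


def connectivity_score(img_a, img_b):
    # Placeholder for the learned energy-based connectivity function (trained
    # with contrastive predictive coding in the paper): higher means img_b is
    # more likely reachable from img_a within a few steps.
    return float(np.exp(-np.linalg.norm(img_a - img_b)))


def plan(start_img, goal_img, context, n_samples=200, threshold=0.05, seed=0):
    rng = np.random.default_rng(seed)
    # 1. Hallucinate image samples for the new domain; append start and goal.
    samples = list(sample_cvae(context, n_samples, rng)) + [start_img, goal_img]
    # 2. Build the connectivity graph: keep an edge when the score clears an
    #    (arbitrary, toy) threshold, weighted so search prefers confident edges.
    graph = nx.DiGraph()
    for i, a in enumerate(samples):
        for j, b in enumerate(samples):
            if i != j:
                score = connectivity_score(a, b)
                if score > threshold:
                    graph.add_edge(i, j, weight=-np.log(score))
    # 3. Plan with conventional graph search; raises an exception if the graph
    #    does not connect start to goal.
    path = nx.shortest_path(graph, source=len(samples) - 2,
                            target=len(samples) - 1, weight="weight")
    return [samples[i] for i in path]  # the visual plan: a sequence of images


if __name__ == "__main__":
    context = np.zeros(4)                 # toy "context" describing the domain
    start, goal = np.zeros(4), 0.9 * np.ones(4)
    print(len(plan(start, goal, context)), "images in the hallucinated plan")

Setting edge weights to -log(score) makes the shortest path maximize the product of connectivity scores along the plan; this particular weighting is an assumption of the sketch rather than a detail taken from the paper.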

Cite this Paper


BibTeX
@InProceedings{pmlr-v119-liu20h,
  title     = {Hallucinative Topological Memory for Zero-Shot Visual Planning},
  author    = {Liu, Kara and Kurutach, Thanard and Tung, Christine and Abbeel, Pieter and Tamar, Aviv},
  booktitle = {Proceedings of the 37th International Conference on Machine Learning},
  pages     = {6259--6270},
  year      = {2020},
  editor    = {III, Hal Daumé and Singh, Aarti},
  volume    = {119},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--18 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v119/liu20h/liu20h.pdf},
  url       = {https://proceedings.mlr.press/v119/liu20h.html},
  abstract  = {In visual planning (VP), an agent learns to plan goal-directed behavior from observations of a dynamical system obtained offline, e.g., images obtained from self-supervised robot interaction. Most previous works on VP approached the problem by planning in a learned latent space, resulting in low-quality visual plans and difficult training algorithms. Here, instead, we propose a simple VP method that plans directly in image space and displays competitive performance. We build on the semi-parametric topological memory (SPTM) method: image samples are treated as nodes in a graph, the graph connectivity is learned from image sequence data, and planning can be performed using conventional graph search methods. We propose two modifications to SPTM. First, we train an energy-based graph connectivity function using contrastive predictive coding that admits stable training. Second, to allow zero-shot planning in new domains, we learn a conditional VAE model that generates images given a context describing the domain, and use these hallucinated samples for building the connectivity graph and planning. We show that this simple approach significantly outperforms the SOTA VP methods, in terms of both plan interpretability and success rate when using the plan to guide a trajectory-following controller. Interestingly, our method can pick up non-trivial visual properties of objects, such as their geometry, and account for them in the plans.}
}
Endnote
%0 Conference Paper
%T Hallucinative Topological Memory for Zero-Shot Visual Planning
%A Kara Liu
%A Thanard Kurutach
%A Christine Tung
%A Pieter Abbeel
%A Aviv Tamar
%B Proceedings of the 37th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2020
%E Hal Daumé III
%E Aarti Singh
%F pmlr-v119-liu20h
%I PMLR
%P 6259--6270
%U https://proceedings.mlr.press/v119/liu20h.html
%V 119
%X In visual planning (VP), an agent learns to plan goal-directed behavior from observations of a dynamical system obtained offline, e.g., images obtained from self-supervised robot interaction. Most previous works on VP approached the problem by planning in a learned latent space, resulting in low-quality visual plans and difficult training algorithms. Here, instead, we propose a simple VP method that plans directly in image space and displays competitive performance. We build on the semi-parametric topological memory (SPTM) method: image samples are treated as nodes in a graph, the graph connectivity is learned from image sequence data, and planning can be performed using conventional graph search methods. We propose two modifications to SPTM. First, we train an energy-based graph connectivity function using contrastive predictive coding that admits stable training. Second, to allow zero-shot planning in new domains, we learn a conditional VAE model that generates images given a context describing the domain, and use these hallucinated samples for building the connectivity graph and planning. We show that this simple approach significantly outperforms the SOTA VP methods, in terms of both plan interpretability and success rate when using the plan to guide a trajectory-following controller. Interestingly, our method can pick up non-trivial visual properties of objects, such as their geometry, and account for them in the plans.
APA
Liu, K., Kurutach, T., Tung, C., Abbeel, P. & Tamar, A. (2020). Hallucinative Topological Memory for Zero-Shot Visual Planning. Proceedings of the 37th International Conference on Machine Learning, in Proceedings of Machine Learning Research 119:6259-6270. Available from https://proceedings.mlr.press/v119/liu20h.html.
