MIRA: Mental Imagery for Robotic Affordances

Yen-Chen Lin, Pete Florence, Andy Zeng, Jonathan T. Barron, Yilun Du, Wei-Chiu Ma, Anthony Simeonov, Alberto Rodriguez Garcia, Phillip Isola
Proceedings of The 6th Conference on Robot Learning, PMLR 205:1916-1927, 2023.

Abstract

Humans form mental images of 3D scenes to support counterfactual imagination, planning, and motor control. Our ability to predict the appearance and affordance of the scene from previously unobserved viewpoints aids us in performing manipulation tasks (e.g., 6-DoF kitting) with a level of ease that is currently out of reach for existing robot learning frameworks. In this work, we aim to build artificial systems that can analogously plan actions on top of imagined images. To this end, we introduce Mental Imagery for Robotic Affordances (MIRA), an action reasoning framework that optimizes actions with novel-view synthesis and affordance prediction in the loop. Given a set of 2D RGB images, MIRA builds a consistent 3D scene representation, through which we synthesize novel orthographic views amenable to pixel-wise affordance prediction for action optimization. We illustrate how this optimization process enables us to generalize to unseen out-of-plane rotations for 6-DoF robotic manipulation tasks given a limited number of demonstrations, paving the way toward machines that autonomously learn to understand the world around them for planning actions.
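
The abstract describes an action-reasoning loop that alternates novel-view synthesis with pixel-wise affordance prediction to select a 6-DoF action. Below is a minimal, hypothetical sketch of such a loop, assuming a `scene_model` object (e.g., a NeRF-style representation fit to posed RGB images, with an orthographic renderer) and an `affordance_net` scoring function; none of these names or interfaces come from the paper, and the code is illustrative rather than the authors' implementation.

```python
# Hypothetical sketch of an "imagine, then score affordances" action loop
# in the spirit of MIRA's abstract; all names here are assumptions.
import numpy as np

def optimize_action(rgb_images, camera_poses, candidate_views, scene_model, affordance_net):
    """Choose an action by rendering imagined orthographic views and scoring affordances."""
    # 1. Fit a consistent 3D scene representation to the posed 2D RGB observations.
    scene_model.fit(rgb_images, camera_poses)

    best_score, best_action = -np.inf, None
    for view in candidate_views:
        # 2. Synthesize a novel orthographic view from this virtual camera
        #    ("mental imagery" of an unobserved viewpoint).
        ortho_image = scene_model.render_orthographic(view)

        # 3. Predict a pixel-wise affordance map on the imagined image.
        affordance_map = affordance_net(ortho_image)  # shape: (H, W)

        # 4. The best pixel in the best view defines the action: the pixel
        #    gives the translation, the view's rotation gives the orientation.
        score = affordance_map.max()
        if score > best_score:
            u, v = np.unravel_index(affordance_map.argmax(), affordance_map.shape)
            best_score = score
            best_action = (view.rotation, view.pixel_to_world(u, v))
    return best_action
```

Sampling `candidate_views` with out-of-plane rotations is what would let such a loop propose 6-DoF actions beyond the top-down poses seen in demonstrations.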

Cite this Paper


BibTeX
@InProceedings{pmlr-v205-lin23c,
  title     = {MIRA: Mental Imagery for Robotic Affordances},
  author    = {Lin, Yen-Chen and Florence, Pete and Zeng, Andy and Barron, Jonathan T. and Du, Yilun and Ma, Wei-Chiu and Simeonov, Anthony and Garcia, Alberto Rodriguez and Isola, Phillip},
  booktitle = {Proceedings of The 6th Conference on Robot Learning},
  pages     = {1916--1927},
  year      = {2023},
  editor    = {Liu, Karen and Kulic, Dana and Ichnowski, Jeff},
  volume    = {205},
  series    = {Proceedings of Machine Learning Research},
  month     = {14--18 Dec},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v205/lin23c/lin23c.pdf},
  url       = {https://proceedings.mlr.press/v205/lin23c.html},
  abstract  = {Humans form mental images of 3D scenes to support counterfactual imagination, planning, and motor control. Our abilities to predict the appearance and affordance of the scene from previously unobserved viewpoints aid us in performing manipulation tasks (e.g., 6-DoF kitting) with a level of ease that is currently out of reach for existing robot learning frameworks. In this work, we aim to build artificial systems that can analogously plan actions on top of imagined images. To this end, we introduce Mental Imagery for Robotic Affordances (MIRA), an action reasoning framework that optimizes actions with novel-view synthesis and affordance prediction in the loop. Given a set of 2D RGB images, MIRA builds a consistent 3D scene representation, through which we synthesize novel orthographic views amenable to pixel-wise affordances prediction for action optimization. We illustrate how this optimization process enables us to generalize to unseen out-of-plane rotations for 6-DoF robotic manipulation tasks given a limited number of demonstrations, paving the way toward machines that autonomously learn to understand the world around them for planning actions.}
}
Endnote
%0 Conference Paper
%T MIRA: Mental Imagery for Robotic Affordances
%A Yen-Chen Lin
%A Pete Florence
%A Andy Zeng
%A Jonathan T. Barron
%A Yilun Du
%A Wei-Chiu Ma
%A Anthony Simeonov
%A Alberto Rodriguez Garcia
%A Phillip Isola
%B Proceedings of The 6th Conference on Robot Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Karen Liu
%E Dana Kulic
%E Jeff Ichnowski
%F pmlr-v205-lin23c
%I PMLR
%P 1916--1927
%U https://proceedings.mlr.press/v205/lin23c.html
%V 205
%X Humans form mental images of 3D scenes to support counterfactual imagination, planning, and motor control. Our abilities to predict the appearance and affordance of the scene from previously unobserved viewpoints aid us in performing manipulation tasks (e.g., 6-DoF kitting) with a level of ease that is currently out of reach for existing robot learning frameworks. In this work, we aim to build artificial systems that can analogously plan actions on top of imagined images. To this end, we introduce Mental Imagery for Robotic Affordances (MIRA), an action reasoning framework that optimizes actions with novel-view synthesis and affordance prediction in the loop. Given a set of 2D RGB images, MIRA builds a consistent 3D scene representation, through which we synthesize novel orthographic views amenable to pixel-wise affordances prediction for action optimization. We illustrate how this optimization process enables us to generalize to unseen out-of-plane rotations for 6-DoF robotic manipulation tasks given a limited number of demonstrations, paving the way toward machines that autonomously learn to understand the world around them for planning actions.
APA
Lin, Y., Florence, P., Zeng, A., Barron, J.T., Du, Y., Ma, W., Simeonov, A., Garcia, A.R. & Isola, P. (2023). MIRA: Mental Imagery for Robotic Affordances. Proceedings of The 6th Conference on Robot Learning, in Proceedings of Machine Learning Research 205:1916-1927. Available from https://proceedings.mlr.press/v205/lin23c.html.