RoboCook: Long-Horizon Elasto-Plastic Object Manipulation with Diverse Tools

Haochen Shi, Huazhe Xu, Samuel Clarke, Yunzhu Li, Jiajun Wu
Proceedings of The 7th Conference on Robot Learning, PMLR 229:642-660, 2023.

Abstract

Humans excel in complex long-horizon soft body manipulation tasks via flexible tool use: bread baking requires a knife to slice the dough and a rolling pin to flatten it. Often regarded as a hallmark of human cognition, tool use in autonomous robots remains limited due to challenges in understanding tool-object interactions. Here we develop an intelligent robotic system, RoboCook, which perceives, models, and manipulates elasto-plastic objects with various tools. RoboCook uses point cloud scene representations, models tool-object interactions with Graph Neural Networks (GNNs), and combines tool classification with self-supervised policy learning to devise manipulation plans. We demonstrate that from just 20 minutes of real-world interaction data per tool, a general-purpose robot arm can learn complex long-horizon soft object manipulation tasks, such as making dumplings and alphabet letter cookies. Extensive evaluations show that RoboCook substantially outperforms state-of-the-art approaches, exhibits robustness against severe external disturbances, and demonstrates adaptability to different materials.

Cite this Paper


BibTeX
@InProceedings{pmlr-v229-shi23a,
  title = {RoboCook: Long-Horizon Elasto-Plastic Object Manipulation with Diverse Tools},
  author = {Shi, Haochen and Xu, Huazhe and Clarke, Samuel and Li, Yunzhu and Wu, Jiajun},
  booktitle = {Proceedings of The 7th Conference on Robot Learning},
  pages = {642--660},
  year = {2023},
  editor = {Tan, Jie and Toussaint, Marc and Darvish, Kourosh},
  volume = {229},
  series = {Proceedings of Machine Learning Research},
  month = {06--09 Nov},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v229/shi23a/shi23a.pdf},
  url = {https://proceedings.mlr.press/v229/shi23a.html},
  abstract = {Humans excel in complex long-horizon soft body manipulation tasks via flexible tool use: bread baking requires a knife to slice the dough and a rolling pin to flatten it. Often regarded as a hallmark of human cognition, tool use in autonomous robots remains limited due to challenges in understanding tool-object interactions. Here we develop an intelligent robotic system, RoboCook, which perceives, models, and manipulates elasto-plastic objects with various tools. RoboCook uses point cloud scene representations, models tool-object interactions with Graph Neural Networks (GNNs), and combines tool classification with self-supervised policy learning to devise manipulation plans. We demonstrate that from just 20 minutes of real-world interaction data per tool, a general-purpose robot arm can learn complex long-horizon soft object manipulation tasks, such as making dumplings and alphabet letter cookies. Extensive evaluations show that RoboCook substantially outperforms state-of-the-art approaches, exhibits robustness against severe external disturbances, and demonstrates adaptability to different materials.}
}
Endnote
%0 Conference Paper
%T RoboCook: Long-Horizon Elasto-Plastic Object Manipulation with Diverse Tools
%A Haochen Shi
%A Huazhe Xu
%A Samuel Clarke
%A Yunzhu Li
%A Jiajun Wu
%B Proceedings of The 7th Conference on Robot Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Jie Tan
%E Marc Toussaint
%E Kourosh Darvish
%F pmlr-v229-shi23a
%I PMLR
%P 642--660
%U https://proceedings.mlr.press/v229/shi23a.html
%V 229
%X Humans excel in complex long-horizon soft body manipulation tasks via flexible tool use: bread baking requires a knife to slice the dough and a rolling pin to flatten it. Often regarded as a hallmark of human cognition, tool use in autonomous robots remains limited due to challenges in understanding tool-object interactions. Here we develop an intelligent robotic system, RoboCook, which perceives, models, and manipulates elasto-plastic objects with various tools. RoboCook uses point cloud scene representations, models tool-object interactions with Graph Neural Networks (GNNs), and combines tool classification with self-supervised policy learning to devise manipulation plans. We demonstrate that from just 20 minutes of real-world interaction data per tool, a general-purpose robot arm can learn complex long-horizon soft object manipulation tasks, such as making dumplings and alphabet letter cookies. Extensive evaluations show that RoboCook substantially outperforms state-of-the-art approaches, exhibits robustness against severe external disturbances, and demonstrates adaptability to different materials.
APA
Shi, H., Xu, H., Clarke, S., Li, Y. & Wu, J. (2023). RoboCook: Long-Horizon Elasto-Plastic Object Manipulation with Diverse Tools. Proceedings of The 7th Conference on Robot Learning, in Proceedings of Machine Learning Research 229:642-660. Available from https://proceedings.mlr.press/v229/shi23a.html.
