Scaling Up and Distilling Down: Language-Guided Robot Skill Acquisition

Huy Ha, Pete Florence, Shuran Song
Proceedings of The 7th Conference on Robot Learning, PMLR 229:3766-3777, 2023.

Abstract

We present a framework for robot skill acquisition, which 1) efficiently scales up data generation of language-labelled robot data and 2) effectively distills this data down into a robust multi-task language-conditioned visuo-motor policy. For (1), we use a large language model (LLM) to guide high-level planning, and sampling-based robot planners (e.g., motion or grasp samplers) to generate diverse and rich manipulation trajectories. To robustify this data-collection process, the LLM also infers a code snippet for the success condition of each task, simultaneously enabling the data-collection process to detect failure and retry, as well as automatic labeling of trajectories with success/failure. For (2), we extend the diffusion policy single-task behavior-cloning approach to multi-task settings with language conditioning. Finally, we propose a new multi-task benchmark with 18 tasks across five domains to test long-horizon behavior, common-sense reasoning, tool-use, and intuitive physics. We find that our distilled policy successfully learned the robust retrying behavior of its data-collection procedure, while improving absolute success rates by 33.2% on average across five domains. Code, data, and additional qualitative results are available at https://www.cs.columbia.edu/~huy/scalingup/.
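To make the data-collection idea concrete, below is a minimal, hypothetical Python sketch of the retry-and-auto-label loop the abstract describes. All names here (query_llm_for_success_check, collect_episode, the Episode record, and the fake simulator state) are illustrative stand-ins, not the released scalingup API; in the actual system the LLM writes the success-condition snippet and sampling-based motion/grasp planners produce the trajectories.

from dataclasses import dataclass, field


@dataclass
class Episode:
    task: str
    actions: list = field(default_factory=list)
    success: bool = False  # automatic label attached to each trajectory


def query_llm_for_success_check(task: str):
    """Stand-in for the LLM inferring a success-condition code snippet.

    In the paper, the LLM emits code that inspects simulator state; here we
    return a trivial checker purely for illustration.
    """
    def check(sim_state: dict) -> bool:
        return sim_state.get(f"{task}_done", False)
    return check


def collect_episode(task: str, sim_state: dict, max_retries: int = 3) -> Episode:
    """Roll out a (stubbed) sampling-based planner, verify with the inferred
    success condition, and retry on failure -- the behavior the distilled
    policy later imitates."""
    check_success = query_llm_for_success_check(task)
    episode = Episode(task=task)
    for attempt in range(max_retries):
        # Stand-in for motion/grasp samplers producing one trajectory attempt.
        episode.actions.append(f"trajectory_attempt_{attempt}")
        sim_state[f"{task}_done"] = attempt == 1  # pretend the 2nd try succeeds
        if check_success(sim_state):
            episode.success = True  # success label for downstream distillation
            break
    return episode


if __name__ == "__main__":
    ep = collect_episode("put_apple_in_drawer", sim_state={})
    print(ep.success, len(ep.actions))  # True 2: succeeded after one retry

The key design point is that the single inferred success check serves two roles: it gates retries during collection and it produces the success/failure labels attached to every trajectory.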

Cite this Paper


BibTeX
@InProceedings{pmlr-v229-ha23a,
  title     = {Scaling Up and Distilling Down: Language-Guided Robot Skill Acquisition},
  author    = {Ha, Huy and Florence, Pete and Song, Shuran},
  booktitle = {Proceedings of The 7th Conference on Robot Learning},
  pages     = {3766--3777},
  year      = {2023},
  editor    = {Tan, Jie and Toussaint, Marc and Darvish, Kourosh},
  volume    = {229},
  series    = {Proceedings of Machine Learning Research},
  month     = {06--09 Nov},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v229/ha23a/ha23a.pdf},
  url       = {https://proceedings.mlr.press/v229/ha23a.html},
}
APA
Ha, H., Florence, P. & Song, S. (2023). Scaling Up and Distilling Down: Language-Guided Robot Skill Acquisition. Proceedings of The 7th Conference on Robot Learning, in Proceedings of Machine Learning Research 229:3766-3777. Available from https://proceedings.mlr.press/v229/ha23a.html.
