UMI-on-Legs: Making Manipulation Policies Mobile with Manipulation-Centric Whole-body Controllers

Huy Ha, Yihuai Gao, Zipeng Fu, Jie Tan, Shuran Song
Proceedings of The 8th Conference on Robot Learning, PMLR 270:5254-5270, 2025.

Abstract

We introduce UMI-on-Legs, a new framework that combines real-world and simulation data for quadruped manipulation systems. We scale task-centric data collection in the real world using a handheld gripper (UMI), providing a cheap way to demonstrate task-relevant manipulation skills without a robot. Simultaneously, we scale robot-centric data in simulation by training a whole-body controller. The interface between these two policies is a set of end-effector trajectories in the task frame, which are inferred by the manipulation policy and passed to the whole-body controller for tracking. We evaluate UMI-on-Legs on prehensile, non-prehensile, and dynamic manipulation tasks, and report over 70% success rate for all tasks. Lastly, we demonstrate the zero-shot cross-embodiment deployment of a pre-trained manipulation policy checkpoint from prior work, originally intended for a fixed-base robot arm, on our quadruped system. We believe this framework provides a scalable path towards learning expressive manipulation skills on dynamic robot embodiments.
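The key idea in the abstract is the interface between the two policies: the manipulation policy predicts end-effector trajectories in the task frame, and the whole-body controller tracks them with all of the robot's joints. Below is a minimal sketch of what such an interface could look like; all class names, method signatures, dimensions, and values are hypothetical illustrations, not the authors' implementation.

    # Hypothetical sketch of the policy-to-controller interface described in the abstract.
    # Names and shapes are assumptions for illustration only.
    import numpy as np

    class ManipulationPolicy:
        """Task-centric policy (e.g., trained on UMI demonstrations).
        From an observation, it predicts a short horizon of end-effector
        poses expressed in the task frame."""
        def predict(self, obs: np.ndarray) -> np.ndarray:
            # Placeholder: horizon of 16 poses, each (x, y, z, qw, qx, qy, qz).
            return np.tile([0.4, 0.0, 0.3, 1.0, 0.0, 0.0, 0.0], (16, 1))

    class WholeBodyController:
        """Robot-centric controller trained in simulation. It tracks
        task-frame end-effector targets using legs and arm together."""
        def track(self, ee_traj_task_frame: np.ndarray, robot_state: dict) -> np.ndarray:
            # Placeholder: joint position targets (assumed 18-DoF quadruped + arm).
            return np.zeros(18)

    # Control loop: the manipulation policy runs at a low rate and hands
    # task-frame trajectories to the whole-body controller, which runs at a
    # higher rate and converts them into joint commands.
    policy, controller = ManipulationPolicy(), WholeBodyController()
    obs, robot_state = np.zeros((224, 224, 3)), {"q": np.zeros(18)}
    ee_traj = policy.predict(obs)                    # task-frame EE trajectory
    joint_targets = controller.track(ee_traj, robot_state)

Because the interface is a task-frame trajectory rather than robot-specific commands, a manipulation policy trained for one embodiment can, in principle, be handed to a controller for another, which is what enables the zero-shot cross-embodiment deployment mentioned above.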

Cite this Paper


BibTeX
@InProceedings{pmlr-v270-ha25a,
  title     = {UMI-on-Legs: Making Manipulation Policies Mobile with Manipulation-Centric Whole-body Controllers},
  author    = {Ha, Huy and Gao, Yihuai and Fu, Zipeng and Tan, Jie and Song, Shuran},
  booktitle = {Proceedings of The 8th Conference on Robot Learning},
  pages     = {5254--5270},
  year      = {2025},
  editor    = {Agrawal, Pulkit and Kroemer, Oliver and Burgard, Wolfram},
  volume    = {270},
  series    = {Proceedings of Machine Learning Research},
  month     = {06--09 Nov},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v270/main/assets/ha25a/ha25a.pdf},
  url       = {https://proceedings.mlr.press/v270/ha25a.html},
  abstract  = {We introduce UMI-on-Legs, a new framework that combines real-world and simulation data for quadruped manipulation systems. We scale task-centric data collection in the real world using a handheld gripper (UMI), providing a cheap way to demonstrate task-relevant manipulation skills without a robot. Simultaneously, we scale robot-centric data in simulation by training a whole-body controller. The interface between these two policies is a set of end-effector trajectories in the task frame, which are inferred by the manipulation policy and passed to the whole-body controller for tracking. We evaluate UMI-on-Legs on prehensile, non-prehensile, and dynamic manipulation tasks, and report over 70% success rate for all tasks. Lastly, we demonstrate the zero-shot cross-embodiment deployment of a pre-trained manipulation policy checkpoint from prior work, originally intended for a fixed-base robot arm, on our quadruped system. We believe this framework provides a scalable path towards learning expressive manipulation skills on dynamic robot embodiments.}
}
Endnote
%0 Conference Paper
%T UMI-on-Legs: Making Manipulation Policies Mobile with Manipulation-Centric Whole-body Controllers
%A Huy Ha
%A Yihuai Gao
%A Zipeng Fu
%A Jie Tan
%A Shuran Song
%B Proceedings of The 8th Conference on Robot Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Pulkit Agrawal
%E Oliver Kroemer
%E Wolfram Burgard
%F pmlr-v270-ha25a
%I PMLR
%P 5254--5270
%U https://proceedings.mlr.press/v270/ha25a.html
%V 270
%X We introduce UMI-on-Legs, a new framework that combines real-world and simulation data for quadruped manipulation systems. We scale task-centric data collection in the real world using a handheld gripper (UMI), providing a cheap way to demonstrate task-relevant manipulation skills without a robot. Simultaneously, we scale robot-centric data in simulation by training a whole-body controller. The interface between these two policies is a set of end-effector trajectories in the task frame, which are inferred by the manipulation policy and passed to the whole-body controller for tracking. We evaluate UMI-on-Legs on prehensile, non-prehensile, and dynamic manipulation tasks, and report over 70% success rate for all tasks. Lastly, we demonstrate the zero-shot cross-embodiment deployment of a pre-trained manipulation policy checkpoint from prior work, originally intended for a fixed-base robot arm, on our quadruped system. We believe this framework provides a scalable path towards learning expressive manipulation skills on dynamic robot embodiments.
APA
Ha, H., Gao, Y., Fu, Z., Tan, J. & Song, S. (2025). UMI-on-Legs: Making Manipulation Policies Mobile with Manipulation-Centric Whole-body Controllers. Proceedings of The 8th Conference on Robot Learning, in Proceedings of Machine Learning Research 270:5254-5270. Available from https://proceedings.mlr.press/v270/ha25a.html.