Harmon: Whole-Body Motion Generation of Humanoid Robots from Language Descriptions

Zhenyu Jiang, Yuqi Xie, Jinhan Li, Ye Yuan, Yifeng Zhu, Yuke Zhu
Proceedings of The 8th Conference on Robot Learning, PMLR 270:3015-3026, 2025.

Abstract

Humanoid robots, with their human-like embodiment, have the potential to integrate seamlessly into human environments. Critical to their coexistence and cooperation with humans is the ability to understand natural language communications and exhibit human-like behaviors. This work focuses on generating diverse whole-body motions for humanoid robots from language descriptions. We leverage human motion priors from extensive human motion datasets to initialize humanoid motions and employ the commonsense reasoning capabilities of Vision Language Models (VLMs) to edit and refine these motions. Our approach demonstrates the capability to produce natural, expressive, and text-aligned humanoid motions, validated through both simulated and real-world experiments. More videos can be found on our website https://ut-austin-rpl.github.io/Harmon/.

Cite this Paper


BibTeX
@InProceedings{pmlr-v270-jiang25b,
  title = {Harmon: Whole-Body Motion Generation of Humanoid Robots from Language Descriptions},
  author = {Jiang, Zhenyu and Xie, Yuqi and Li, Jinhan and Yuan, Ye and Zhu, Yifeng and Zhu, Yuke},
  booktitle = {Proceedings of The 8th Conference on Robot Learning},
  pages = {3015--3026},
  year = {2025},
  editor = {Agrawal, Pulkit and Kroemer, Oliver and Burgard, Wolfram},
  volume = {270},
  series = {Proceedings of Machine Learning Research},
  month = {06--09 Nov},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v270/main/assets/jiang25b/jiang25b.pdf},
  url = {https://proceedings.mlr.press/v270/jiang25b.html},
  abstract = {Humanoid robots, with their human-like embodiment, have the potential to integrate seamlessly into human environments. Critical to their coexistence and cooperation with humans is the ability to understand natural language communications and exhibit human-like behaviors. This work focuses on generating diverse whole-body motions for humanoid robots from language descriptions. We leverage human motion priors from extensive human motion datasets to initialize humanoid motions and employ the commonsense reasoning capabilities of Vision Language Models (VLMs) to edit and refine these motions. Our approach demonstrates the capability to produce natural, expressive, and text-aligned humanoid motions, validated through both simulated and real-world experiments. More videos can be found on our website https://ut-austin-rpl.github.io/Harmon/.}
}
Endnote
%0 Conference Paper
%T Harmon: Whole-Body Motion Generation of Humanoid Robots from Language Descriptions
%A Zhenyu Jiang
%A Yuqi Xie
%A Jinhan Li
%A Ye Yuan
%A Yifeng Zhu
%A Yuke Zhu
%B Proceedings of The 8th Conference on Robot Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Pulkit Agrawal
%E Oliver Kroemer
%E Wolfram Burgard
%F pmlr-v270-jiang25b
%I PMLR
%P 3015--3026
%U https://proceedings.mlr.press/v270/jiang25b.html
%V 270
%X Humanoid robots, with their human-like embodiment, have the potential to integrate seamlessly into human environments. Critical to their coexistence and cooperation with humans is the ability to understand natural language communications and exhibit human-like behaviors. This work focuses on generating diverse whole-body motions for humanoid robots from language descriptions. We leverage human motion priors from extensive human motion datasets to initialize humanoid motions and employ the commonsense reasoning capabilities of Vision Language Models (VLMs) to edit and refine these motions. Our approach demonstrates the capability to produce natural, expressive, and text-aligned humanoid motions, validated through both simulated and real-world experiments. More videos can be found on our website https://ut-austin-rpl.github.io/Harmon/.
APA
Jiang, Z., Xie, Y., Li, J., Yuan, Y., Zhu, Y., & Zhu, Y. (2025). Harmon: Whole-Body Motion Generation of Humanoid Robots from Language Descriptions. Proceedings of The 8th Conference on Robot Learning, in Proceedings of Machine Learning Research 270:3015-3026. Available from https://proceedings.mlr.press/v270/jiang25b.html.
