Bi-Level Motion Imitation for Humanoid Robots

Wenshuai Zhao, Yi Zhao, Joni Pajarinen, Michael Muehlebach
Proceedings of The 8th Conference on Robot Learning, PMLR 270:963-981, 2025.

Abstract

Imitation learning from human motion capture (MoCap) data provides a promising way to train humanoid robots. However, due to differences in morphology, such as varying degrees of joint freedom and force limits, exact replication of human behaviors may not be feasible for humanoid robots. Consequently, incorporating physically infeasible MoCap data in training datasets can adversely affect the performance of the robot policy. To address this issue, we propose a bi-level optimization-based imitation learning framework that alternates between optimizing both the robot policy and the target MoCap data. Specifically, we first develop a generative latent dynamics model using a novel self-consistent auto-encoder, which learns sparse and structured motion representations while capturing desired motion patterns in the dataset. The dynamics model is then utilized to generate reference motions while the latent representation regularizes the bi-level motion imitation process. Simulations conducted with a realistic model of a humanoid robot demonstrate that our method enhances the robot policy by modifying reference motions to be physically consistent.

Cite this Paper


BibTeX
@InProceedings{pmlr-v270-zhao25a,
  title     = {Bi-Level Motion Imitation for Humanoid Robots},
  author    = {Zhao, Wenshuai and Zhao, Yi and Pajarinen, Joni and Muehlebach, Michael},
  booktitle = {Proceedings of The 8th Conference on Robot Learning},
  pages     = {963--981},
  year      = {2025},
  editor    = {Agrawal, Pulkit and Kroemer, Oliver and Burgard, Wolfram},
  volume    = {270},
  series    = {Proceedings of Machine Learning Research},
  month     = {06--09 Nov},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v270/main/assets/zhao25a/zhao25a.pdf},
  url       = {https://proceedings.mlr.press/v270/zhao25a.html},
  abstract  = {Imitation learning from human motion capture (MoCap) data provides a promising way to train humanoid robots. However, due to differences in morphology, such as varying degrees of joint freedom and force limits, exact replication of human behaviors may not be feasible for humanoid robots. Consequently, incorporating physically infeasible MoCap data in training datasets can adversely affect the performance of the robot policy. To address this issue, we propose a bi-level optimization-based imitation learning framework that alternates between optimizing both the robot policy and the target MoCap data. Specifically, we first develop a generative latent dynamics model using a novel self-consistent auto-encoder, which learns sparse and structured motion representations while capturing desired motion patterns in the dataset. The dynamics model is then utilized to generate reference motions while the latent representation regularizes the bi-level motion imitation process. Simulations conducted with a realistic model of a humanoid robot demonstrate that our method enhances the robot policy by modifying reference motions to be physically consistent.}
}
Endnote
%0 Conference Paper
%T Bi-Level Motion Imitation for Humanoid Robots
%A Wenshuai Zhao
%A Yi Zhao
%A Joni Pajarinen
%A Michael Muehlebach
%B Proceedings of The 8th Conference on Robot Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Pulkit Agrawal
%E Oliver Kroemer
%E Wolfram Burgard
%F pmlr-v270-zhao25a
%I PMLR
%P 963--981
%U https://proceedings.mlr.press/v270/zhao25a.html
%V 270
%X Imitation learning from human motion capture (MoCap) data provides a promising way to train humanoid robots. However, due to differences in morphology, such as varying degrees of joint freedom and force limits, exact replication of human behaviors may not be feasible for humanoid robots. Consequently, incorporating physically infeasible MoCap data in training datasets can adversely affect the performance of the robot policy. To address this issue, we propose a bi-level optimization-based imitation learning framework that alternates between optimizing both the robot policy and the target MoCap data. Specifically, we first develop a generative latent dynamics model using a novel self-consistent auto-encoder, which learns sparse and structured motion representations while capturing desired motion patterns in the dataset. The dynamics model is then utilized to generate reference motions while the latent representation regularizes the bi-level motion imitation process. Simulations conducted with a realistic model of a humanoid robot demonstrate that our method enhances the robot policy by modifying reference motions to be physically consistent.
APA
Zhao, W., Zhao, Y., Pajarinen, J., & Muehlebach, M. (2025). Bi-Level Motion Imitation for Humanoid Robots. Proceedings of The 8th Conference on Robot Learning, in Proceedings of Machine Learning Research 270:963-981. Available from https://proceedings.mlr.press/v270/zhao25a.html.