Hold My Beer: Learning Gentle Humanoid Locomotion and End-Effector Stabilization Control

Yitang Li, Yuanhang Zhang, Wenli Xiao, Chaoyi Pan, Haoyang Weng, Guanqi He, Tairan He, Guanya Shi
Proceedings of The 9th Conference on Robot Learning, PMLR 305:4506-4523, 2025.

Abstract

Can your humanoid walk up and hand you a full cup of beer—without spilling a drop? While humanoids are increasingly featured in flashy demos—dancing, delivering packages, traversing rough terrain—fine-grained control during locomotion remains a significant challenge. In particular, stabilizing a filled end-effector (EE) while walking is far from solved, due to a fundamental mismatch in task dynamics: locomotion demands slow-timescale, robust control, whereas EE stabilization requires rapid, high-precision corrections. To address this, we propose SoFTA, a Slow-Fast Two-Agent framework that decouples upper-body and lower-body control into separate agents operating at different frequencies and with distinct rewards. This temporal and objective separation mitigates policy interference mitagates objective conflict and enables coordinated whole-body behavior. SoFTA executes upper-body actions at 100 Hz for precise EE control and lower-body actions at 50 Hz for robust gait. It reduces EE acceleration by 2-5x to baselines and performs 2–3x closer to human-level stability, enabling delicate tasks such as carrying nearly full cups, capturing steady video during locomotion, and disturbance rejection with EE stability.

Cite this Paper


BibTeX
@InProceedings{pmlr-v305-li25i, title = {Hold My Beer: Learning Gentle Humanoid Locomotion and End-Effector Stabilization Control}, author = {Li, Yitang and Zhang, Yuanhang and Xiao, Wenli and Pan, Chaoyi and Weng, Haoyang and He, Guanqi and He, Tairan and Shi, Guanya}, booktitle = {Proceedings of The 9th Conference on Robot Learning}, pages = {4506--4523}, year = {2025}, editor = {Lim, Joseph and Song, Shuran and Park, Hae-Won}, volume = {305}, series = {Proceedings of Machine Learning Research}, month = {27--30 Sep}, publisher = {PMLR}, pdf = {https://raw.githubusercontent.com/mlresearch/v305/main/assets/li25i/li25i.pdf}, url = {https://proceedings.mlr.press/v305/li25i.html}, abstract = {Can your humanoid walk up and hand you a full cup of beer—without spilling a drop? While humanoids are increasingly featured in flashy demos—dancing, delivering packages, traversing rough terrain—fine-grained control during locomotion remains a significant challenge. In particular, stabilizing a filled end-effector (EE) while walking is far from solved, due to a fundamental mismatch in task dynamics: locomotion demands slow-timescale, robust control, whereas EE stabilization requires rapid, high-precision corrections. To address this, we propose SoFTA, a Slow-Fast Two-Agent framework that decouples upper-body and lower-body control into separate agents operating at different frequencies and with distinct rewards. This temporal and objective separation mitigates policy interference mitagates objective conflict and enables coordinated whole-body behavior. SoFTA executes upper-body actions at 100 Hz for precise EE control and lower-body actions at 50 Hz for robust gait. It reduces EE acceleration by 2-5x to baselines and performs 2–3x closer to human-level stability, enabling delicate tasks such as carrying nearly full cups, capturing steady video during locomotion, and disturbance rejection with EE stability.} }
Endnote
%0 Conference Paper %T Hold My Beer: Learning Gentle Humanoid Locomotion and End-Effector Stabilization Control %A Yitang Li %A Yuanhang Zhang %A Wenli Xiao %A Chaoyi Pan %A Haoyang Weng %A Guanqi He %A Tairan He %A Guanya Shi %B Proceedings of The 9th Conference on Robot Learning %C Proceedings of Machine Learning Research %D 2025 %E Joseph Lim %E Shuran Song %E Hae-Won Park %F pmlr-v305-li25i %I PMLR %P 4506--4523 %U https://proceedings.mlr.press/v305/li25i.html %V 305 %X Can your humanoid walk up and hand you a full cup of beer—without spilling a drop? While humanoids are increasingly featured in flashy demos—dancing, delivering packages, traversing rough terrain—fine-grained control during locomotion remains a significant challenge. In particular, stabilizing a filled end-effector (EE) while walking is far from solved, due to a fundamental mismatch in task dynamics: locomotion demands slow-timescale, robust control, whereas EE stabilization requires rapid, high-precision corrections. To address this, we propose SoFTA, a Slow-Fast Two-Agent framework that decouples upper-body and lower-body control into separate agents operating at different frequencies and with distinct rewards. This temporal and objective separation mitigates policy interference mitagates objective conflict and enables coordinated whole-body behavior. SoFTA executes upper-body actions at 100 Hz for precise EE control and lower-body actions at 50 Hz for robust gait. It reduces EE acceleration by 2-5x to baselines and performs 2–3x closer to human-level stability, enabling delicate tasks such as carrying nearly full cups, capturing steady video during locomotion, and disturbance rejection with EE stability.
APA
Li, Y., Zhang, Y., Xiao, W., Pan, C., Weng, H., He, G., He, T. & Shi, G.. (2025). Hold My Beer: Learning Gentle Humanoid Locomotion and End-Effector Stabilization Control. Proceedings of The 9th Conference on Robot Learning, in Proceedings of Machine Learning Research 305:4506-4523 Available from https://proceedings.mlr.press/v305/li25i.html.

Related Material