Learning Long-Horizon Robot Manipulation Skills via Privileged Action

Xiaofeng Mao, Yucheng XU, Zhaole Sun, Elle Miller, Daniel Layeghi, Michael Mistry
Proceedings of The 9th Conference on Robot Learning, PMLR 305:1063-1078, 2025.

Abstract

Long-horizon contact-rich tasks are challenging to learn with reinforcement learning, due to ineffective exploration of high-dimensional state spaces with sparse rewards. The learning process often gets stuck in local optima and demands task-specific reward fine-tuning for complex scenarios. In this work, we propose a structured framework that leverages privileged actions with curriculum learning, enabling the policy to efficiently acquire long-horizon skills without relying on extensive reward engineering or reference trajectories. Specifically, we use privileged actions in simulation with a general training procedure that would be infeasible to implement in real-world scenarios. These privileges include relaxed constraints and virtual forces that enhance interaction and exploration with objects. Our results successfully achieve complex multi-stage long-horizon tasks that naturally combine non-prehensile manipulation with grasping to lift objects from non-graspable poses. We demonstrate generality by maintaining a parsimonious reward structure and showing convergence to diverse and robust behaviors across various environments. Our approach outperforms state-of-the-art methods in these tasks, converging to solutions where others fail.
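The abstract describes annealing simulation-only privileges (virtual forces, relaxed constraints) over a curriculum so the final policy no longer depends on them. A minimal sketch of such a schedule, assuming a linear decay and illustrative names (`PrivilegeSchedule`, `max_virtual_force`) that are not from the paper:

```python
# Hypothetical illustration, not the authors' implementation: a curriculum
# that decays simulation "privileges" -- a virtual helper force on the object
# and a relaxation of contact constraints -- to zero over training, so the
# converged policy operates under realistic physics.

from dataclasses import dataclass


@dataclass
class PrivilegeSchedule:
    total_steps: int          # training steps over which privileges decay
    max_virtual_force: float  # initial magnitude of the virtual helper force (N)
    max_relaxation: float     # initial contact-constraint relaxation (m)

    def level(self, step: int) -> float:
        """Linear decay from 1.0 (full privilege) to 0.0 (physical realism)."""
        return max(0.0, 1.0 - step / self.total_steps)

    def virtual_force(self, step: int) -> float:
        return self.max_virtual_force * self.level(step)

    def constraint_relaxation(self, step: int) -> float:
        return self.max_relaxation * self.level(step)


sched = PrivilegeSchedule(total_steps=1_000_000,
                          max_virtual_force=5.0,
                          max_relaxation=0.02)
print(sched.virtual_force(0))          # 5.0 -- full privilege at the start
print(sched.virtual_force(500_000))    # 2.5 -- halfway through the curriculum
print(sched.virtual_force(2_000_000))  # 0.0 -- privileges fully removed
```

The decay shape and magnitudes here are placeholders; the paper's actual curriculum may differ.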

Cite this Paper


BibTeX
@InProceedings{pmlr-v305-mao25a,
  title     = {Learning Long-Horizon Robot Manipulation Skills via Privileged Action},
  author    = {Mao, Xiaofeng and XU, Yucheng and Sun, Zhaole and Miller, Elle and Layeghi, Daniel and Mistry, Michael},
  booktitle = {Proceedings of The 9th Conference on Robot Learning},
  pages     = {1063--1078},
  year      = {2025},
  editor    = {Lim, Joseph and Song, Shuran and Park, Hae-Won},
  volume    = {305},
  series    = {Proceedings of Machine Learning Research},
  month     = {27--30 Sep},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v305/main/assets/mao25a/mao25a.pdf},
  url       = {https://proceedings.mlr.press/v305/mao25a.html},
  abstract  = {Long-horizon contact-rich tasks are challenging to learn with reinforcement learning, due to ineffective exploration of high-dimensional state spaces with sparse rewards. The learning process often gets stuck in local optima and demands task-specific reward fine-tuning for complex scenarios. In this work, we propose a structured framework that leverages privileged actions with curriculum learning, enabling the policy to efficiently acquire long-horizon skills without relying on extensive reward engineering or reference trajectories. Specifically, we use privileged actions in simulation with a general training procedure that would be infeasible to implement in real-world scenarios. These privileges include relaxed constraints and virtual forces that enhance interaction and exploration with objects. Our results successfully achieve complex multi-stage long-horizon tasks that naturally combine non-prehensile manipulation with grasping to lift objects from non-graspable poses. We demonstrate generality by maintaining a parsimonious reward structure and showing convergence to diverse and robust behaviors across various environments. Our approach outperforms state-of-the-art methods in these tasks, converging to solutions where others fail.}
}
Endnote
%0 Conference Paper
%T Learning Long-Horizon Robot Manipulation Skills via Privileged Action
%A Xiaofeng Mao
%A Yucheng XU
%A Zhaole Sun
%A Elle Miller
%A Daniel Layeghi
%A Michael Mistry
%B Proceedings of The 9th Conference on Robot Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Joseph Lim
%E Shuran Song
%E Hae-Won Park
%F pmlr-v305-mao25a
%I PMLR
%P 1063--1078
%U https://proceedings.mlr.press/v305/mao25a.html
%V 305
%X Long-horizon contact-rich tasks are challenging to learn with reinforcement learning, due to ineffective exploration of high-dimensional state spaces with sparse rewards. The learning process often gets stuck in local optima and demands task-specific reward fine-tuning for complex scenarios. In this work, we propose a structured framework that leverages privileged actions with curriculum learning, enabling the policy to efficiently acquire long-horizon skills without relying on extensive reward engineering or reference trajectories. Specifically, we use privileged actions in simulation with a general training procedure that would be infeasible to implement in real-world scenarios. These privileges include relaxed constraints and virtual forces that enhance interaction and exploration with objects. Our results successfully achieve complex multi-stage long-horizon tasks that naturally combine non-prehensile manipulation with grasping to lift objects from non-graspable poses. We demonstrate generality by maintaining a parsimonious reward structure and showing convergence to diverse and robust behaviors across various environments. Our approach outperforms state-of-the-art methods in these tasks, converging to solutions where others fail.
APA
Mao, X., XU, Y., Sun, Z., Miller, E., Layeghi, D. & Mistry, M. (2025). Learning Long-Horizon Robot Manipulation Skills via Privileged Action. Proceedings of The 9th Conference on Robot Learning, in Proceedings of Machine Learning Research 305:1063-1078. Available from https://proceedings.mlr.press/v305/mao25a.html.