Guided Reinforcement Learning for Robust Multi-Contact Loco-Manipulation

Jean Pierre Sleiman, Mayank Mittal, Marco Hutter
Proceedings of The 8th Conference on Robot Learning, PMLR 270:531-546, 2025.

Abstract

Reinforcement learning (RL) has shown remarkable proficiency in developing robust control policies for contact-rich applications. However, it typically requires meticulous Markov Decision Process (MDP) designing tailored to each task and robotic platform. This work addresses this challenge by creating a systematic approach to behavior synthesis and control for multi-contact loco-manipulation. We define a task-independent MDP formulation to learn robust RL policies using a single demonstration (per task) generated from a fast model-based trajectory optimization method. Our framework is validated on diverse real-world tasks, such as navigating spring-loaded doors and manipulating heavy dishwashers. The learned behaviors can handle dynamic uncertainties and external disturbances, showcasing recovery maneuvers, such as re-grasping objects during execution. Finally, we successfully transfer the policies to a real robot, demonstrating the approach’s practical viability.

Cite this Paper


BibTeX
@InProceedings{pmlr-v270-sleiman25a,
  title     = {Guided Reinforcement Learning for Robust Multi-Contact Loco-Manipulation},
  author    = {Sleiman, Jean Pierre and Mittal, Mayank and Hutter, Marco},
  booktitle = {Proceedings of The 8th Conference on Robot Learning},
  pages     = {531--546},
  year      = {2025},
  editor    = {Agrawal, Pulkit and Kroemer, Oliver and Burgard, Wolfram},
  volume    = {270},
  series    = {Proceedings of Machine Learning Research},
  month     = {06--09 Nov},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v270/main/assets/sleiman25a/sleiman25a.pdf},
  url       = {https://proceedings.mlr.press/v270/sleiman25a.html},
  abstract  = {Reinforcement learning (RL) has shown remarkable proficiency in developing robust control policies for contact-rich applications. However, it typically requires meticulous Markov Decision Process (MDP) designing tailored to each task and robotic platform. This work addresses this challenge by creating a systematic approach to behavior synthesis and control for multi-contact loco-manipulation. We define a task-independent MDP formulation to learn robust RL policies using a single demonstration (per task) generated from a fast model-based trajectory optimization method. Our framework is validated on diverse real-world tasks, such as navigating spring-loaded doors and manipulating heavy dishwashers. The learned behaviors can handle dynamic uncertainties and external disturbances, showcasing recovery maneuvers, such as re-grasping objects during execution. Finally, we successfully transfer the policies to a real robot, demonstrating the approach’s practical viability.}
}
Endnote
%0 Conference Paper
%T Guided Reinforcement Learning for Robust Multi-Contact Loco-Manipulation
%A Jean Pierre Sleiman
%A Mayank Mittal
%A Marco Hutter
%B Proceedings of The 8th Conference on Robot Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Pulkit Agrawal
%E Oliver Kroemer
%E Wolfram Burgard
%F pmlr-v270-sleiman25a
%I PMLR
%P 531--546
%U https://proceedings.mlr.press/v270/sleiman25a.html
%V 270
%X Reinforcement learning (RL) has shown remarkable proficiency in developing robust control policies for contact-rich applications. However, it typically requires meticulous Markov Decision Process (MDP) designing tailored to each task and robotic platform. This work addresses this challenge by creating a systematic approach to behavior synthesis and control for multi-contact loco-manipulation. We define a task-independent MDP formulation to learn robust RL policies using a single demonstration (per task) generated from a fast model-based trajectory optimization method. Our framework is validated on diverse real-world tasks, such as navigating spring-loaded doors and manipulating heavy dishwashers. The learned behaviors can handle dynamic uncertainties and external disturbances, showcasing recovery maneuvers, such as re-grasping objects during execution. Finally, we successfully transfer the policies to a real robot, demonstrating the approach’s practical viability.
APA
Sleiman, J.P., Mittal, M. & Hutter, M. (2025). Guided Reinforcement Learning for Robust Multi-Contact Loco-Manipulation. Proceedings of The 8th Conference on Robot Learning, in Proceedings of Machine Learning Research 270:531-546. Available from https://proceedings.mlr.press/v270/sleiman25a.html.