Multi-critic Learning for Whole-body End-effector Twist Tracking

Aravind Elanjimattathil Vijayan, Andrei Cramariuc, Mattia Risiglione, Christian Gehring, Marco Hutter
Proceedings of The 9th Conference on Robot Learning, PMLR 305:1470-1485, 2025.

Abstract

Learning whole-body control for locomotion and arm motions in a single policy has challenges, as the two tasks have conflicting goals. For instance, efficient locomotion typically favors a horizontal base orientation, while end-effector tracking may benefit from base tilting to extend reachability. Additionally, current Reinforcement Learning (RL) approaches using a pose-based task specification lack the ability to directly control the end-effector velocity, making smoothly executing trajectories very challenging. To address these limitations, we propose an RL-based framework that allows for dynamic, velocity-aware whole-body end-effector control. Our method introduces a multi-critic actor architecture that decouples the reward signals for locomotion and manipulation, simplifying reward tuning and allowing the policy to resolve task conflicts more effectively. Furthermore, we design a twist-based end-effector task formulation that can track both discrete poses and motion trajectories. We validate our approach through a set of simulation and hardware experiments using a quadruped robot equipped with a robotic arm. The resulting controller can simultaneously walk and move its end-effector and shows emergent whole-body behaviors, where the base assists the arm in extending the workspace, despite a lack of explicit formulations.

Cite this Paper


BibTeX
@InProceedings{pmlr-v305-vijayan25a, title = {Multi-critic Learning for Whole-body End-effector Twist Tracking}, author = {Vijayan, Aravind Elanjimattathil and Cramariuc, Andrei and Risiglione, Mattia and Gehring, Christian and Hutter, Marco}, booktitle = {Proceedings of The 9th Conference on Robot Learning}, pages = {1470--1485}, year = {2025}, editor = {Lim, Joseph and Song, Shuran and Park, Hae-Won}, volume = {305}, series = {Proceedings of Machine Learning Research}, month = {27--30 Sep}, publisher = {PMLR}, pdf = {https://raw.githubusercontent.com/mlresearch/v305/main/assets/vijayan25a/vijayan25a.pdf}, url = {https://proceedings.mlr.press/v305/vijayan25a.html}, abstract = {Learning whole-body control for locomotion and arm motions in a single policy has challenges, as the two tasks have conflicting goals. For instance, efficient locomotion typically favors a horizontal base orientation, while end-effector tracking may benefit from base tilting to extend reachability. Additionally, current Reinforcement Learning (RL) approaches using a pose-based task specification lack the ability to directly control the end-effector velocity, making smoothly executing trajectories very challenging. To address these limitations, we propose an RL-based framework that allows for dynamic, velocity-aware whole-body end-effector control. Our method introduces a multi-critic actor architecture that decouples the reward signals for locomotion and manipulation, simplifying reward tuning and allowing the policy to resolve task conflicts more effectively. Furthermore, we design a twist-based end-effector task formulation that can track both discrete poses and motion trajectories. We validate our approach through a set of simulation and hardware experiments using a quadruped robot equipped with a robotic arm. The resulting controller can simultaneously walk and move its end-effector and shows emergent whole-body behaviors, where the base assists the arm in extending the workspace, despite a lack of explicit formulations.} }
Endnote
%0 Conference Paper %T Multi-critic Learning for Whole-body End-effector Twist Tracking %A Aravind Elanjimattathil Vijayan %A Andrei Cramariuc %A Mattia Risiglione %A Christian Gehring %A Marco Hutter %B Proceedings of The 9th Conference on Robot Learning %C Proceedings of Machine Learning Research %D 2025 %E Joseph Lim %E Shuran Song %E Hae-Won Park %F pmlr-v305-vijayan25a %I PMLR %P 1470--1485 %U https://proceedings.mlr.press/v305/vijayan25a.html %V 305 %X Learning whole-body control for locomotion and arm motions in a single policy has challenges, as the two tasks have conflicting goals. For instance, efficient locomotion typically favors a horizontal base orientation, while end-effector tracking may benefit from base tilting to extend reachability. Additionally, current Reinforcement Learning (RL) approaches using a pose-based task specification lack the ability to directly control the end-effector velocity, making smoothly executing trajectories very challenging. To address these limitations, we propose an RL-based framework that allows for dynamic, velocity-aware whole-body end-effector control. Our method introduces a multi-critic actor architecture that decouples the reward signals for locomotion and manipulation, simplifying reward tuning and allowing the policy to resolve task conflicts more effectively. Furthermore, we design a twist-based end-effector task formulation that can track both discrete poses and motion trajectories. We validate our approach through a set of simulation and hardware experiments using a quadruped robot equipped with a robotic arm. The resulting controller can simultaneously walk and move its end-effector and shows emergent whole-body behaviors, where the base assists the arm in extending the workspace, despite a lack of explicit formulations.
APA
Vijayan, A.E., Cramariuc, A., Risiglione, M., Gehring, C. & Hutter, M.. (2025). Multi-critic Learning for Whole-body End-effector Twist Tracking. Proceedings of The 9th Conference on Robot Learning, in Proceedings of Machine Learning Research 305:1470-1485 Available from https://proceedings.mlr.press/v305/vijayan25a.html.

Related Material