SHADOW: Leveraging Segmentation Masks for Cross-Embodiment Policy Transfer

Marion Lepert, Ria Doshi, Jeannette Bohg
Proceedings of The 8th Conference on Robot Learning, PMLR 270:3536-3550, 2025.

Abstract

Data collection in robotics is spread across diverse hardware, and this variation will increase as new hardware is developed. Effective use of this growing body of data requires methods capable of learning from diverse robot embodiments. We consider the setting of training a policy using expert trajectories from a single robot arm (the source), and evaluating on a different robot arm for which no data was collected (the target). We present a data editing scheme termed Shadow, in which the robot during training and evaluation is replaced with a composite segmentation mask of the source and target robots. In this way, the input data distribution at train and test time match closely, enabling robust policy transfer to the new unseen robot while being far more data efficient than approaches that require co-training on large amounts of data from diverse embodiments. We demonstrate that an approach as simple as Shadow is effective both in simulation on varying tasks and robots, and on real robot hardware, where Shadow demonstrates over 2x improvement in success rate compared to the strongest baseline.
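The data edit described in the abstract can be illustrated with a minimal sketch. It assumes the composite is the pixel-wise union of two boolean robot masks painted over the image in a flat fill color. The abstract does not say how the masks are obtained or how they are blended, so the function name `composite_shadow`, its parameters, and the fill color are hypothetical and should not be read as the authors' implementation.

```python
import numpy as np


def composite_shadow(image, source_mask, target_mask, fill_value=0):
    """Paint the union of two robot masks over an RGB image.

    Hypothetical helper: taking the union and using a flat fill color are
    assumptions made for illustration, not details from the paper.
    """
    height_width = image.shape[:2]
    if source_mask.shape != height_width or target_mask.shape != height_width:
        raise ValueError("masks must match the image height and width")

    # A pixel belongs to the composite mask if either robot covers it.
    union = source_mask.astype(bool) | target_mask.astype(bool)

    # Edit a copy so the original observation is left untouched.
    edited = image.copy()
    edited[union] = fill_value
    return edited, union


# Toy 4x4 white image with two overlapping one-row "robot" masks.
image = np.full((4, 4, 3), 255, dtype=np.uint8)
source_mask = np.zeros((4, 4), dtype=bool)
source_mask[0, :2] = True
target_mask = np.zeros((4, 4), dtype=bool)
target_mask[0, 1:3] = True

edited, union = composite_shadow(image, source_mask, target_mask)
```

Applying the same edit to training images from the source robot and to evaluation images from the target robot is what would keep the two input distributions close in this simplified picture.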

Cite this Paper


BibTeX
@InProceedings{pmlr-v270-lepert25a,
  title = {SHADOW: Leveraging Segmentation Masks for Cross-Embodiment Policy Transfer},
  author = {Lepert, Marion and Doshi, Ria and Bohg, Jeannette},
  booktitle = {Proceedings of The 8th Conference on Robot Learning},
  pages = {3536--3550},
  year = {2025},
  editor = {Agrawal, Pulkit and Kroemer, Oliver and Burgard, Wolfram},
  volume = {270},
  series = {Proceedings of Machine Learning Research},
  month = {06--09 Nov},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v270/main/assets/lepert25a/lepert25a.pdf},
  url = {https://proceedings.mlr.press/v270/lepert25a.html},
  abstract = {Data collection in robotics is spread across diverse hardware, and this variation will increase as new hardware is developed. Effective use of this growing body of data requires methods capable of learning from diverse robot embodiments. We consider the setting of training a policy using expert trajectories from a single robot arm (the source), and evaluating on a different robot arm for which no data was collected (the target). We present a data editing scheme termed Shadow, in which the robot during training and evaluation is replaced with a composite segmentation mask of the source and target robots. In this way, the input data distribution at train and test time match closely, enabling robust policy transfer to the new unseen robot while being far more data efficient than approaches that require co-training on large amounts of data from diverse embodiments. We demonstrate that an approach as simple as Shadow is effective both in simulation on varying tasks and robots, and on real robot hardware, where Shadow demonstrates over 2x improvement in success rate compared to the strongest baseline.}
}
Endnote
%0 Conference Paper
%T SHADOW: Leveraging Segmentation Masks for Cross-Embodiment Policy Transfer
%A Marion Lepert
%A Ria Doshi
%A Jeannette Bohg
%B Proceedings of The 8th Conference on Robot Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Pulkit Agrawal
%E Oliver Kroemer
%E Wolfram Burgard
%F pmlr-v270-lepert25a
%I PMLR
%P 3536--3550
%U https://proceedings.mlr.press/v270/lepert25a.html
%V 270
%X Data collection in robotics is spread across diverse hardware, and this variation will increase as new hardware is developed. Effective use of this growing body of data requires methods capable of learning from diverse robot embodiments. We consider the setting of training a policy using expert trajectories from a single robot arm (the source), and evaluating on a different robot arm for which no data was collected (the target). We present a data editing scheme termed Shadow, in which the robot during training and evaluation is replaced with a composite segmentation mask of the source and target robots. In this way, the input data distribution at train and test time match closely, enabling robust policy transfer to the new unseen robot while being far more data efficient than approaches that require co-training on large amounts of data from diverse embodiments. We demonstrate that an approach as simple as Shadow is effective both in simulation on varying tasks and robots, and on real robot hardware, where Shadow demonstrates over 2x improvement in success rate compared to the strongest baseline.
APA
Lepert, M., Doshi, R. & Bohg, J. (2025). SHADOW: Leveraging Segmentation Masks for Cross-Embodiment Policy Transfer. Proceedings of The 8th Conference on Robot Learning, in Proceedings of Machine Learning Research 270:3536-3550. Available from https://proceedings.mlr.press/v270/lepert25a.html.
