Polybot: Training One Policy Across Robots While Embracing Variability

Jonathan Heewon Yang, Dorsa Sadigh, Chelsea Finn
Proceedings of The 7th Conference on Robot Learning, PMLR 229:2955-2974, 2023.

Abstract

Reusing large datasets is crucial to scale vision-based robotic manipulators to everyday scenarios due to the high cost of collecting robotic datasets. However, robotic platforms possess varying control schemes, camera viewpoints, kinematic configurations, and end-effector morphologies, posing significant challenges when transferring manipulation skills from one platform to another. To tackle this problem, we propose a set of key design decisions to train a single policy for deployment on multiple robotic platforms. Our framework first aligns the observation and action spaces of our policy across embodiments via utilizing wrist cameras and a unified, but modular codebase. To bridge the remaining domain shift, we align our policy’s internal representations across embodiments via contrastive learning. We evaluate our method on a dataset collected over 60 hours spanning 6 tasks and 3 robots with varying joint configurations and sizes: the WidowX 250S, Franka Emika Panda, and Sawyer. Our results demonstrate significant improvements in success rate and sample efficiency for our policy when using new task data collected on a different robot, validating our proposed design decisions. More details and videos can be found on our project website: https://sites.google.com/view/cradle-multirobot

Cite this Paper


BibTeX
@InProceedings{pmlr-v229-yang23c,
  title     = {Polybot: Training One Policy Across Robots While Embracing Variability},
  author    = {Yang, Jonathan Heewon and Sadigh, Dorsa and Finn, Chelsea},
  booktitle = {Proceedings of The 7th Conference on Robot Learning},
  pages     = {2955--2974},
  year      = {2023},
  editor    = {Tan, Jie and Toussaint, Marc and Darvish, Kourosh},
  volume    = {229},
  series    = {Proceedings of Machine Learning Research},
  month     = {06--09 Nov},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v229/yang23c/yang23c.pdf},
  url       = {https://proceedings.mlr.press/v229/yang23c.html},
  abstract  = {Reusing large datasets is crucial to scale vision-based robotic manipulators to everyday scenarios due to the high cost of collecting robotic datasets. However, robotic platforms possess varying control schemes, camera viewpoints, kinematic configurations, and end-effector morphologies, posing significant challenges when transferring manipulation skills from one platform to another. To tackle this problem, we propose a set of key design decisions to train a single policy for deployment on multiple robotic platforms. Our framework first aligns the observation and action spaces of our policy across embodiments via utilizing wrist cameras and a unified, but modular codebase. To bridge the remaining domain shift, we align our policy’s internal representations across embodiments via contrastive learning. We evaluate our method on a dataset collected over 60 hours spanning 6 tasks and 3 robots with varying joint configurations and sizes: the WidowX 250S, Franka Emika Panda, and Sawyer. Our results demonstrate significant improvements in success rate and sample efficiency for our policy when using new task data collected on a different robot, validating our proposed design decisions. More details and videos can be found on our project website: https://sites.google.com/view/cradle-multirobot}
}
Endnote
%0 Conference Paper
%T Polybot: Training One Policy Across Robots While Embracing Variability
%A Jonathan Heewon Yang
%A Dorsa Sadigh
%A Chelsea Finn
%B Proceedings of The 7th Conference on Robot Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Jie Tan
%E Marc Toussaint
%E Kourosh Darvish
%F pmlr-v229-yang23c
%I PMLR
%P 2955--2974
%U https://proceedings.mlr.press/v229/yang23c.html
%V 229
%X Reusing large datasets is crucial to scale vision-based robotic manipulators to everyday scenarios due to the high cost of collecting robotic datasets. However, robotic platforms possess varying control schemes, camera viewpoints, kinematic configurations, and end-effector morphologies, posing significant challenges when transferring manipulation skills from one platform to another. To tackle this problem, we propose a set of key design decisions to train a single policy for deployment on multiple robotic platforms. Our framework first aligns the observation and action spaces of our policy across embodiments via utilizing wrist cameras and a unified, but modular codebase. To bridge the remaining domain shift, we align our policy’s internal representations across embodiments via contrastive learning. We evaluate our method on a dataset collected over 60 hours spanning 6 tasks and 3 robots with varying joint configurations and sizes: the WidowX 250S, Franka Emika Panda, and Sawyer. Our results demonstrate significant improvements in success rate and sample efficiency for our policy when using new task data collected on a different robot, validating our proposed design decisions. More details and videos can be found on our project website: https://sites.google.com/view/cradle-multirobot
APA
Yang, J.H., Sadigh, D. & Finn, C. (2023). Polybot: Training One Policy Across Robots While Embracing Variability. Proceedings of The 7th Conference on Robot Learning, in Proceedings of Machine Learning Research 229:2955-2974. Available from https://proceedings.mlr.press/v229/yang23c.html.
