Modularity through Attention: Efficient Training and Transfer of Language-Conditioned Policies for Robot Manipulation

Yifan Zhou; Shubham Sonawani; Mariano Phielipp; Simon Stepputtis; Heni Amor

Proceedings of The 6th Conference on Robot Learning, PMLR 205:1684-1695, 2023.

Abstract

Language-conditioned policies allow robots to interpret and execute human instructions. Learning such policies requires a substantial investment with regards to time and compute resources. Still, the resulting controllers are highly device-specific and cannot easily be transferred to a robot with different morphology, capability, appearance or dynamics. In this paper, we propose a sample-efficient approach for training language-conditioned manipulation policies that allows for rapid transfer across different types of robots. By introducing a novel method, namely Hierarchical Modularity, and adopting supervised attention across multiple sub-modules, we bridge the divide between modular and end-to-end learning and enable the reuse of functional building blocks. In both simulated and real world robot manipulation experiments, we demonstrate that our method outperforms the current state-of-the-art methods and can transfer policies across 4 different robots in a sample-efficient manner. Finally, we show that the functionality of learned sub-modules is maintained beyond the training process and can be used to introspect the robot decision-making process.

Cite this Paper

BibTeX


@InProceedings{pmlr-v205-zhou23b,
  title = 	 {Modularity through Attention: Efficient Training and Transfer of Language-Conditioned Policies for Robot Manipulation},
  author =       {Zhou, Yifan and Sonawani, Shubham and Phielipp, Mariano and Stepputtis, Simon and Amor, Heni},
  booktitle = 	 {Proceedings of The 6th Conference on Robot Learning},
  pages = 	 {1684--1695},
  year = 	 {2023},
  editor = 	 {Liu, Karen and Kulic, Dana and Ichnowski, Jeff},
  volume = 	 {205},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {14--18 Dec},
  publisher =    {PMLR},
  pdf = 	 {https://proceedings.mlr.press/v205/zhou23b/zhou23b.pdf},
  url = 	 {https://proceedings.mlr.press/v205/zhou23b.html},
  abstract = 	 {Language-conditioned policies allow robots to interpret and execute human instructions. Learning such policies requires a substantial investment with regards to time and compute resources. Still, the resulting controllers are highly device-specific and cannot easily be transferred to a robot with different morphology, capability, appearance or dynamics. In this paper, we propose a sample-efficient approach for training language-conditioned manipulation policies that allows for rapid transfer across different types of robots. By introducing a novel method, namely Hierarchical Modularity, and adopting supervised attention across multiple sub-modules, we bridge the divide between modular and end-to-end learning and enable the reuse of functional building blocks. In both simulated and real world robot manipulation experiments, we demonstrate that our method outperforms the current state-of-the-art methods and can transfer policies across 4 different robots in a sample-efficient manner. Finally, we show that the functionality of learned sub-modules is maintained beyond the training process and can be used to introspect the robot decision-making process.}
}

Endnote

%0 Conference Paper
%T Modularity through Attention: Efficient Training and Transfer of Language-Conditioned Policies for Robot Manipulation
%A Yifan Zhou
%A Shubham Sonawani
%A Mariano Phielipp
%A Simon Stepputtis
%A Heni Amor
%B Proceedings of The 6th Conference on Robot Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Karen Liu
%E Dana Kulic
%E Jeff Ichnowski
%F pmlr-v205-zhou23b
%I PMLR
%P 1684--1695
%U https://proceedings.mlr.press/v205/zhou23b.html
%V 205
%X Language-conditioned policies allow robots to interpret and execute human instructions. Learning such policies requires a substantial investment with regards to time and compute resources. Still, the resulting controllers are highly device-specific and cannot easily be transferred to a robot with different morphology, capability, appearance or dynamics. In this paper, we propose a sample-efficient approach for training language-conditioned manipulation policies that allows for rapid transfer across different types of robots. By introducing a novel method, namely Hierarchical Modularity, and adopting supervised attention across multiple sub-modules, we bridge the divide between modular and end-to-end learning and enable the reuse of functional building blocks. In both simulated and real world robot manipulation experiments, we demonstrate that our method outperforms the current state-of-the-art methods and can transfer policies across 4 different robots in a sample-efficient manner. Finally, we show that the functionality of learned sub-modules is maintained beyond the training process and can be used to introspect the robot decision-making process.

APA


Zhou, Y., Sonawani, S., Phielipp, M., Stepputtis, S. & Amor, H. (2023). Modularity through Attention: Efficient Training and Transfer of Language-Conditioned Policies for Robot Manipulation. Proceedings of The 6th Conference on Robot Learning, in Proceedings of Machine Learning Research 205:1684-1695. Available from https://proceedings.mlr.press/v205/zhou23b.html.
