LILA: Language-Informed Latent Actions

Siddharth Karamcheti; Megha Srivastava; Percy Liang; Dorsa Sadigh

Proceedings of the 5th Conference on Robot Learning, PMLR 164:1379-1390, 2022.

Abstract

We introduce Language-Informed Latent Actions (LILA), a framework for learning natural language interfaces in the context of human-robot collaboration. LILA falls under the shared autonomy paradigm: in addition to providing discrete language inputs, humans are given a low-dimensional controller – e.g., a 2 degree-of-freedom (DoF) joystick that can move left/right and up/down – for operating the robot. LILA learns to use language to modulate this controller, providing users with a language-informed control space: given an instruction like "place the cereal bowl on the tray," LILA may learn a 2-DoF space where one dimension controls the distance from the robot’s end-effector to the bowl, and the other dimension controls the robot’s end-effector pose relative to the grasp point on the bowl. We evaluate LILA with real-world user studies, where users can provide a language instruction while operating a 7-DoF Franka Emika Panda Arm to complete a series of complex manipulation tasks. We show that LILA models are not only more sample efficient and performant than imitation learning and end-effector control baselines, but that they are also qualitatively preferred by users.

Cite this Paper

BibTeX


@InProceedings{pmlr-v164-karamcheti22a,
  title     = {LILA: Language-Informed Latent Actions},
  author    = {Karamcheti, Siddharth and Srivastava, Megha and Liang, Percy and Sadigh, Dorsa},
  booktitle = {Proceedings of the 5th Conference on Robot Learning},
  pages     = {1379--1390},
  year      = {2022},
  editor    = {Faust, Aleksandra and Hsu, David and Neumann, Gerhard},
  volume    = {164},
  series    = {Proceedings of Machine Learning Research},
  month     = {08--11 Nov},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v164/karamcheti22a/karamcheti22a.pdf},
  url       = {https://proceedings.mlr.press/v164/karamcheti22a.html},
  abstract  = {We introduce Language-Informed Latent Actions (LILA), a framework for learning natural language interfaces in the context of human-robot collaboration. LILA falls under the shared autonomy paradigm: in addition to providing discrete language inputs, humans are given a low-dimensional controller – e.g., a 2 degree-of-freedom (DoF) joystick that can move left/right and up/down – for operating the robot. LILA learns to use language to modulate this controller, providing users with a language-informed control space: given an instruction like "place the cereal bowl on the tray," LILA may learn a 2-DoF space where one dimension controls the distance from the robot’s end-effector to the bowl, and the other dimension controls the robot’s end-effector pose relative to the grasp point on the bowl. We evaluate LILA with real-world user studies, where users can provide a language instruction while operating a 7-DoF Franka Emika Panda Arm to complete a series of complex manipulation tasks. We show that LILA models are not only more sample efficient and performant than imitation learning and end-effector control baselines, but that they are also qualitatively preferred by users.}
}

Endnote

%0 Conference Paper
%T LILA: Language-Informed Latent Actions
%A Siddharth Karamcheti
%A Megha Srivastava
%A Percy Liang
%A Dorsa Sadigh
%B Proceedings of the 5th Conference on Robot Learning
%C Proceedings of Machine Learning Research
%D 2022
%E Aleksandra Faust
%E David Hsu
%E Gerhard Neumann
%F pmlr-v164-karamcheti22a
%I PMLR
%P 1379-1390
%U https://proceedings.mlr.press/v164/karamcheti22a.html
%V 164
%X We introduce Language-Informed Latent Actions (LILA), a framework for learning natural language interfaces in the context of human-robot collaboration. LILA falls under the shared autonomy paradigm: in addition to providing discrete language inputs, humans are given a low-dimensional controller – e.g., a 2 degree-of-freedom (DoF) joystick that can move left/right and up/down – for operating the robot. LILA learns to use language to modulate this controller, providing users with a language-informed control space: given an instruction like "place the cereal bowl on the tray," LILA may learn a 2-DoF space where one dimension controls the distance from the robot’s end-effector to the bowl, and the other dimension controls the robot’s end-effector pose relative to the grasp point on the bowl. We evaluate LILA with real-world user studies, where users can provide a language instruction while operating a 7-DoF Franka Emika Panda Arm to complete a series of complex manipulation tasks. We show that LILA models are not only more sample efficient and performant than imitation learning and end-effector control baselines, but that they are also qualitatively preferred by users.

APA


Karamcheti, S., Srivastava, M., Liang, P., & Sadigh, D. (2022). LILA: Language-Informed Latent Actions. Proceedings of the 5th Conference on Robot Learning, in Proceedings of Machine Learning Research 164:1379-1390. Available from https://proceedings.mlr.press/v164/karamcheti22a.html.
