Sim-to-Real Transfer for Vision-and-Language Navigation

Peter Anderson; Ayush Shrivastava; Joanne Truong; Arjun Majumdar; Devi Parikh; Dhruv Batra; Stefan Lee

Proceedings of the 2020 Conference on Robot Learning, PMLR 155:671-681, 2021.

Abstract

We study the challenging problem of releasing a robot in a previously unseen environment, and having it follow unconstrained natural language navigation instructions. Recent work on the task of Vision-and-Language Navigation (VLN) has achieved significant progress in simulation. To assess the implications of this work for robotics, we transfer a VLN agent trained in simulation to a physical robot. To bridge the gap between the high-level discrete action space learned by the VLN agent, and the robot’s low-level continuous action space, we propose a subgoal model to identify nearby waypoints, and use domain randomization to mitigate visual domain differences. For accurate sim and real comparisons in parallel environments, we annotate a 325 m² office space with 1.3 km of navigation instructions, and create a digitized replica in simulation. We find that sim-to-real transfer to an environment not seen in training is successful if an occupancy map and navigation graph can be collected and annotated in advance (success rate of 46.8% vs. 55.9% in sim), but much more challenging in the hardest setting with no prior mapping at all (success rate of 22.5%).

Cite this Paper

BibTeX


@InProceedings{pmlr-v155-anderson21a,
  title = 	 {Sim-to-Real Transfer for Vision-and-Language Navigation},
  author =       {Anderson, Peter and Shrivastava, Ayush and Truong, Joanne and Majumdar, Arjun and Parikh, Devi and Batra, Dhruv and Lee, Stefan},
  booktitle = 	 {Proceedings of the 2020 Conference on Robot Learning},
  pages = 	 {671--681},
  year = 	 {2021},
  editor = 	 {Kober, Jens and Ramos, Fabio and Tomlin, Claire},
  volume = 	 {155},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {16--18 Nov},
  publisher =    {PMLR},
  pdf = 	 {https://proceedings.mlr.press/v155/anderson21a/anderson21a.pdf},
  url = 	 {https://proceedings.mlr.press/v155/anderson21a.html},
  abstract = 	 {We study the challenging problem of releasing a robot in a previously unseen environment, and having it follow unconstrained natural language navigation instructions. Recent work on the task of Vision-and-Language Navigation (VLN) has achieved significant progress in simulation. To assess the implications of this work for robotics, we transfer a VLN agent trained in simulation to a physical robot. To bridge the gap between the high-level discrete action space learned by the VLN agent, and the robot’s low-level continuous action space, we propose a subgoal model to identify nearby waypoints, and use domain randomization to mitigate visual domain differences. For accurate sim and real comparisons in parallel environments, we annotate a 325 m$^2$ office space with 1.3 km of navigation instructions, and create a digitized replica in simulation. We find that sim-to-real transfer to an environment not seen in training is successful if an occupancy map and navigation graph can be collected and annotated in advance (success rate of 46.8% vs. 55.9% in sim), but much more challenging in the hardest setting with no prior mapping at all (success rate of 22.5%).}
}

Endnote

%0 Conference Paper
%T Sim-to-Real Transfer for Vision-and-Language Navigation
%A Peter Anderson
%A Ayush Shrivastava
%A Joanne Truong
%A Arjun Majumdar
%A Devi Parikh
%A Dhruv Batra
%A Stefan Lee
%B Proceedings of the 2020 Conference on Robot Learning
%C Proceedings of Machine Learning Research
%D 2021
%E Jens Kober
%E Fabio Ramos
%E Claire Tomlin
%F pmlr-v155-anderson21a
%I PMLR
%P 671--681
%U https://proceedings.mlr.press/v155/anderson21a.html
%V 155
%X We study the challenging problem of releasing a robot in a previously unseen environment, and having it follow unconstrained natural language navigation instructions. Recent work on the task of Vision-and-Language Navigation (VLN) has achieved significant progress in simulation. To assess the implications of this work for robotics, we transfer a VLN agent trained in simulation to a physical robot. To bridge the gap between the high-level discrete action space learned by the VLN agent, and the robot’s low-level continuous action space, we propose a subgoal model to identify nearby waypoints, and use domain randomization to mitigate visual domain differences. For accurate sim and real comparisons in parallel environments, we annotate a 325 m² office space with 1.3 km of navigation instructions, and create a digitized replica in simulation. We find that sim-to-real transfer to an environment not seen in training is successful if an occupancy map and navigation graph can be collected and annotated in advance (success rate of 46.8% vs. 55.9% in sim), but much more challenging in the hardest setting with no prior mapping at all (success rate of 22.5%).

APA


Anderson, P., Shrivastava, A., Truong, J., Majumdar, A., Parikh, D., Batra, D. & Lee, S. (2021). Sim-to-Real Transfer for Vision-and-Language Navigation. Proceedings of the 2020 Conference on Robot Learning, in Proceedings of Machine Learning Research 155:671-681. Available from https://proceedings.mlr.press/v155/anderson21a.html.
