Neural Modular Control for Embodied Question Answering

Abhishek Das; Georgia Gkioxari; Stefan Lee; Devi Parikh; Dhruv Batra

Proceedings of The 2nd Conference on Robot Learning, PMLR 87:53-62, 2018.

Abstract

We present a modular approach for learning policies for navigation over long planning horizons from language input. Our hierarchical policy operates at multiple timescales, where the higher-level master policy proposes subgoals to be executed by specialized sub-policies. Our choice of subgoals is compositional and semantic, i.e. they can be sequentially combined in arbitrary orderings, and assume human-interpretable descriptions (e.g. ‘exit room’, ‘find kitchen’, ‘find refrigerator’, etc.). We use imitation learning to warm-start policies at each level of the hierarchy, dramatically increasing sample efficiency, followed by reinforcement learning. Independent reinforcement learning at each level of hierarchy enables sub-policies to adapt to consequences of their actions and recover from errors. Subsequent joint hierarchical training enables the master policy to adapt to the sub-policies. On the challenging EQA [1] benchmark in House3D [2], requiring navigating diverse realistic indoor environments, our approach outperforms prior work by a significant margin, both in terms of navigation and question answering.

Cite this Paper

BibTeX


@InProceedings{pmlr-v87-das18a,
  title = 	 {Neural Modular Control for Embodied Question Answering},
  author =       {Das, Abhishek and Gkioxari, Georgia and Lee, Stefan and Parikh, Devi and Batra, Dhruv},
  booktitle = 	 {Proceedings of The 2nd Conference on Robot Learning},
  pages = 	 {53--62},
  year = 	 {2018},
  editor = 	 {Billard, Aude and Dragan, Anca and Peters, Jan and Morimoto, Jun},
  volume = 	 {87},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {29--31 Oct},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v87/das18a/das18a.pdf},
  url = 	 {https://proceedings.mlr.press/v87/das18a.html},
  abstract = 	 {We present a modular approach for learning policies for navigation over long planning horizons from language input. Our hierarchical policy operates at multiple timescales, where the higher-level master policy proposes subgoals to be executed by specialized sub-policies. Our choice of subgoals is compositional and semantic, i.e. they can be sequentially combined in arbitrary orderings, and assume human-interpretable descriptions (e.g. ‘exit room’, ‘find kitchen’, ‘find refrigerator’, etc.). We use imitation learning to warm-start policies at each level of the hierarchy, dramatically increasing sample efficiency, followed by reinforcement learning. Independent reinforcement learning at each level of hierarchy enables sub-policies to adapt to consequences of their actions and recover from errors. Subsequent joint hierarchical training enables the master policy to adapt to the sub-policies. On the challenging EQA [1] benchmark in House3D [2], requiring navigating diverse realistic indoor environments, our approach outperforms prior work by a significant margin, both in terms of navigation and question answering.}
}

Endnote

%0 Conference Paper
%T Neural Modular Control for Embodied Question Answering
%A Abhishek Das
%A Georgia Gkioxari
%A Stefan Lee
%A Devi Parikh
%A Dhruv Batra
%B Proceedings of The 2nd Conference on Robot Learning
%C Proceedings of Machine Learning Research
%D 2018
%E Aude Billard
%E Anca Dragan
%E Jan Peters
%E Jun Morimoto
%F pmlr-v87-das18a
%I PMLR
%P 53--62
%U https://proceedings.mlr.press/v87/das18a.html
%V 87
%X We present a modular approach for learning policies for navigation over long planning horizons from language input. Our hierarchical policy operates at multiple timescales, where the higher-level master policy proposes subgoals to be executed by specialized sub-policies. Our choice of subgoals is compositional and semantic, i.e. they can be sequentially combined in arbitrary orderings, and assume human-interpretable descriptions (e.g. ‘exit room’, ‘find kitchen’, ‘find refrigerator’, etc.). We use imitation learning to warm-start policies at each level of the hierarchy, dramatically increasing sample efficiency, followed by reinforcement learning. Independent reinforcement learning at each level of hierarchy enables sub-policies to adapt to consequences of their actions and recover from errors. Subsequent joint hierarchical training enables the master policy to adapt to the sub-policies. On the challenging EQA [1] benchmark in House3D [2], requiring navigating diverse realistic indoor environments, our approach outperforms prior work by a significant margin, both in terms of navigation and question answering.

APA


Das, A., Gkioxari, G., Lee, S., Parikh, D. & Batra, D. (2018). Neural Modular Control for Embodied Question Answering. Proceedings of The 2nd Conference on Robot Learning, in Proceedings of Machine Learning Research 87:53-62. Available from https://proceedings.mlr.press/v87/das18a.html.
