Grounding Language to Entities and Dynamics for Generalization in Reinforcement Learning

Austin W. Hanjie; Victor Y Zhong; Karthik Narasimhan

Proceedings of the 38th International Conference on Machine Learning, PMLR 139:4051-4062, 2021.

Abstract

We investigate the use of natural language to drive the generalization of control policies and introduce the new multi-task environment Messenger with free-form text manuals describing the environment dynamics. Unlike previous work, Messenger does not assume prior knowledge connecting text and state observations: the control policy must simultaneously ground the game manual to entity symbols and dynamics in the environment. We develop a new model, EMMA (Entity Mapper with Multi-modal Attention), which uses an entity-conditioned attention module that allows for selective focus over relevant descriptions in the manual for each entity in the environment. EMMA is end-to-end differentiable and learns a latent grounding of entities and dynamics from text to observations using only environment rewards. EMMA achieves successful zero-shot generalization to unseen games with new dynamics, obtaining a 40% higher win rate compared to multiple baselines. However, win rate on the hardest stage of Messenger remains low (10%), demonstrating the need for additional work in this direction.

Cite this Paper

BibTeX

@InProceedings{pmlr-v139-hanjie21a,
  title     = {Grounding Language to Entities and Dynamics for Generalization in Reinforcement Learning},
  author    = {Hanjie, Austin W. and Zhong, Victor Y and Narasimhan, Karthik},
  booktitle = {Proceedings of the 38th International Conference on Machine Learning},
  pages     = {4051--4062},
  year      = {2021},
  editor    = {Meila, Marina and Zhang, Tong},
  volume    = {139},
  series    = {Proceedings of Machine Learning Research},
  month     = {18--24 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v139/hanjie21a/hanjie21a.pdf},
  url       = {https://proceedings.mlr.press/v139/hanjie21a.html},
  abstract = 	 {We investigate the use of natural language to drive the generalization of control policies and introduce the new multi-task environment Messenger with free-form text manuals describing the environment dynamics. Unlike previous work, Messenger does not assume prior knowledge connecting text and state observations {—} the control policy must simultaneously ground the game manual to entity symbols and dynamics in the environment. We develop a new model, EMMA (Entity Mapper with Multi-modal Attention) which uses an entity-conditioned attention module that allows for selective focus over relevant descriptions in the manual for each entity in the environment. EMMA is end-to-end differentiable and learns a latent grounding of entities and dynamics from text to observations using only environment rewards. EMMA achieves successful zero-shot generalization to unseen games with new dynamics, obtaining a 40% higher win rate compared to multiple baselines. However, win rate on the hardest stage of Messenger remains low (10%), demonstrating the need for additional work in this direction.}
}

Endnote

%0 Conference Paper
%T Grounding Language to Entities and Dynamics for Generalization in Reinforcement Learning
%A Austin W. Hanjie
%A Victor Y Zhong
%A Karthik Narasimhan
%B Proceedings of the 38th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2021
%E Marina Meila
%E Tong Zhang
%F pmlr-v139-hanjie21a
%I PMLR
%P 4051--4062
%U https://proceedings.mlr.press/v139/hanjie21a.html
%V 139
%X We investigate the use of natural language to drive the generalization of control policies and introduce the new multi-task environment Messenger with free-form text manuals describing the environment dynamics. Unlike previous work, Messenger does not assume prior knowledge connecting text and state observations: the control policy must simultaneously ground the game manual to entity symbols and dynamics in the environment. We develop a new model, EMMA (Entity Mapper with Multi-modal Attention), which uses an entity-conditioned attention module that allows for selective focus over relevant descriptions in the manual for each entity in the environment. EMMA is end-to-end differentiable and learns a latent grounding of entities and dynamics from text to observations using only environment rewards. EMMA achieves successful zero-shot generalization to unseen games with new dynamics, obtaining a 40% higher win rate compared to multiple baselines. However, win rate on the hardest stage of Messenger remains low (10%), demonstrating the need for additional work in this direction.

APA

Hanjie, A.W., Zhong, V.Y. & Narasimhan, K. (2021). Grounding Language to Entities and Dynamics for Generalization in Reinforcement Learning. Proceedings of the 38th International Conference on Machine Learning, in Proceedings of Machine Learning Research 139:4051-4062. Available from https://proceedings.mlr.press/v139/hanjie21a.html.
