Guided Feature Transformation (GFT): A Neural Language Grounding Module for Embodied Agents

Haonan Yu, Xiaochen Lian, Haichao Zhang, Wei Xu
Proceedings of The 2nd Conference on Robot Learning, PMLR 87:81-98, 2018.

Abstract

Recently there has been a rising interest in training agents, embodied in virtual environments, to perform language-directed tasks by deep reinforcement learning. In this paper, we propose a simple but effective neural language grounding module for embodied agents that can be trained end to end from scratch taking raw pixels, unstructured linguistic commands, and sparse rewards as the inputs. We model the language grounding process as a language-guided transformation of visual features, where latent sentence embeddings are used as the transformation matrices. In several language-directed navigation tasks that feature challenging partial observability and require simple reasoning, our module significantly outperforms the state of the art. We also release XWORLD3D, an easy-to-customize 3D environment that can be modified to evaluate a variety of embodied agents.

Cite this Paper


BibTeX
@InProceedings{pmlr-v87-yu18a,
  title     = {Guided Feature Transformation (GFT): A Neural Language Grounding Module for Embodied Agents},
  author    = {Yu, Haonan and Lian, Xiaochen and Zhang, Haichao and Xu, Wei},
  booktitle = {Proceedings of The 2nd Conference on Robot Learning},
  pages     = {81--98},
  year      = {2018},
  editor    = {Billard, Aude and Dragan, Anca and Peters, Jan and Morimoto, Jun},
  volume    = {87},
  series    = {Proceedings of Machine Learning Research},
  month     = {29--31 Oct},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v87/yu18a/yu18a.pdf},
  url       = {https://proceedings.mlr.press/v87/yu18a.html},
  abstract  = {Recently there has been a rising interest in training agents, embodied in virtual environments, to perform language-directed tasks by deep reinforcement learning. In this paper, we propose a simple but effective neural language grounding module for embodied agents that can be trained end to end from scratch taking raw pixels, unstructured linguistic commands, and sparse rewards as the inputs. We model the language grounding process as a language-guided transformation of visual features, where latent sentence embeddings are used as the transformation matrices. In several language-directed navigation tasks that feature challenging partial observability and require simple reasoning, our module significantly outperforms the state of the art. We also release XWORLD3D, an easy-to-customize 3D environment that can be modified to evaluate a variety of embodied agents.}
}
Endnote
%0 Conference Paper
%T Guided Feature Transformation (GFT): A Neural Language Grounding Module for Embodied Agents
%A Haonan Yu
%A Xiaochen Lian
%A Haichao Zhang
%A Wei Xu
%B Proceedings of The 2nd Conference on Robot Learning
%C Proceedings of Machine Learning Research
%D 2018
%E Aude Billard
%E Anca Dragan
%E Jan Peters
%E Jun Morimoto
%F pmlr-v87-yu18a
%I PMLR
%P 81--98
%U https://proceedings.mlr.press/v87/yu18a.html
%V 87
%X Recently there has been a rising interest in training agents, embodied in virtual environments, to perform language-directed tasks by deep reinforcement learning. In this paper, we propose a simple but effective neural language grounding module for embodied agents that can be trained end to end from scratch taking raw pixels, unstructured linguistic commands, and sparse rewards as the inputs. We model the language grounding process as a language-guided transformation of visual features, where latent sentence embeddings are used as the transformation matrices. In several language-directed navigation tasks that feature challenging partial observability and require simple reasoning, our module significantly outperforms the state of the art. We also release XWORLD3D, an easy-to-customize 3D environment that can be modified to evaluate a variety of embodied agents.
APA
Yu, H., Lian, X., Zhang, H. & Xu, W. (2018). Guided Feature Transformation (GFT): A Neural Language Grounding Module for Embodied Agents. Proceedings of The 2nd Conference on Robot Learning, in Proceedings of Machine Learning Research 87:81-98. Available from https://proceedings.mlr.press/v87/yu18a.html.