Conditional Driving from Natural Language Instructions

Junha Roh, Chris Paxton, Andrzej Pronobis, Ali Farhadi, Dieter Fox
Proceedings of the Conference on Robot Learning, PMLR 100:540-551, 2020.

Abstract

Widespread adoption of self-driving cars will depend not only on their safety but also largely on their ability to interact with human users. Just like human drivers, self-driving cars will be expected to understand and safely follow natural-language directions that suddenly alter the pre-planned route according to the user’s preferences or in the presence of ambiguities, particularly in locations with poor or outdated map coverage. To this end, we propose a language-grounded driving agent implementing a hierarchical policy using recurrent layers and gated attention. The hierarchical approach enables us to reason both in terms of high-level language instructions describing long time horizons and low-level, complex, continuous state/action spaces required for real-time control of a self-driving car. We train our policy with conditional imitation learning from realistic language data collected from human drivers and navigators. Through quantitative and interactive experiments within the CARLA framework, we show that our model can successfully interpret language instructions and follow them safely, even when generalizing to previously unseen environments. Code and video are available at: https://sites.google.com/view/language-grounded-driving.
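
As an illustration of the gated-attention fusion mentioned in the abstract, below is a minimal sketch of channel-wise sigmoid gating that combines an instruction embedding with convolutional image features. It is written in PyTorch under assumed layer sizes; the class name GatedAttentionFusion and the dimensions lang_dim and img_channels are illustrative, and the paper's actual architecture (including its recurrent layers and the hierarchical split between instruction interpretation and low-level control) is not reproduced here.

import torch
import torch.nn as nn

class GatedAttentionFusion(nn.Module):
    """Channel-wise sigmoid gating of image features by a language embedding
    (a common 'gated attention' formulation; layer sizes here are assumed)."""

    def __init__(self, lang_dim: int, img_channels: int):
        super().__init__()
        self.gate = nn.Linear(lang_dim, img_channels)

    def forward(self, img_feat: torch.Tensor, lang_emb: torch.Tensor) -> torch.Tensor:
        # img_feat: (B, C, H, W) convolutional features from the camera image
        # lang_emb: (B, lang_dim), e.g. the final hidden state of an instruction encoder
        g = torch.sigmoid(self.gate(lang_emb))           # (B, C) gates in [0, 1]
        return img_feat * g.unsqueeze(-1).unsqueeze(-1)  # broadcast gates over H and W

# Hypothetical usage: the gated features would then feed a recurrent controller
# that regresses continuous steering and throttle commands.
fusion = GatedAttentionFusion(lang_dim=64, img_channels=128)
fused = fusion(torch.randn(1, 128, 20, 20), torch.randn(1, 64))  # -> (1, 128, 20, 20)

Gating the visual features channel-wise lets the instruction modulate which visual cues the downstream controller attends to, which is why this style of fusion is common in language-conditioned policies.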

Cite this Paper


BibTeX
@InProceedings{pmlr-v100-roh20a,
  title     = {Conditional Driving from Natural Language Instructions},
  author    = {Roh, Junha and Paxton, Chris and Pronobis, Andrzej and Farhadi, Ali and Fox, Dieter},
  booktitle = {Proceedings of the Conference on Robot Learning},
  pages     = {540--551},
  year      = {2020},
  editor    = {Kaelbling, Leslie Pack and Kragic, Danica and Sugiura, Komei},
  volume    = {100},
  series    = {Proceedings of Machine Learning Research},
  month     = {30 Oct--01 Nov},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v100/roh20a/roh20a.pdf},
  url       = {https://proceedings.mlr.press/v100/roh20a.html}
}
Endnote
%0 Conference Paper
%T Conditional Driving from Natural Language Instructions
%A Junha Roh
%A Chris Paxton
%A Andrzej Pronobis
%A Ali Farhadi
%A Dieter Fox
%B Proceedings of the Conference on Robot Learning
%C Proceedings of Machine Learning Research
%D 2020
%E Leslie Pack Kaelbling
%E Danica Kragic
%E Komei Sugiura
%F pmlr-v100-roh20a
%I PMLR
%P 540--551
%U https://proceedings.mlr.press/v100/roh20a.html
%V 100
APA
Roh, J., Paxton, C., Pronobis, A., Farhadi, A. & Fox, D. (2020). Conditional Driving from Natural Language Instructions. Proceedings of the Conference on Robot Learning, in Proceedings of Machine Learning Research 100:540-551. Available from https://proceedings.mlr.press/v100/roh20a.html.