Offline Reinforcement Learning for Visual Navigation

Dhruv Shah, Arjun Bhorkar, Hrishit Leen, Ilya Kostrikov, Nicholas Rhinehart, Sergey Levine
Proceedings of The 6th Conference on Robot Learning, PMLR 205:44-54, 2023.

Abstract

Reinforcement learning can enable robots to navigate to distant goals while optimizing user-specified reward functions, including preferences for following lanes, staying on paved paths, or avoiding freshly mowed grass. However, online learning from trial-and-error for real-world robots is logistically challenging, and methods that instead can utilize existing datasets of robotic navigation data could be significantly more scalable and enable broader generalization. In this paper, we present ReViND, the first offline RL system for robotic navigation that can leverage previously collected data to optimize user-specified reward functions in the real world. We evaluate our system for off-road navigation without any additional data collection or fine-tuning, and show that it can navigate to distant goals using only offline training from this dataset, and exhibit behaviors that qualitatively differ based on the user-specified reward function.

Cite this Paper


BibTeX
@InProceedings{pmlr-v205-shah23a,
  title     = {Offline Reinforcement Learning for Visual Navigation},
  author    = {Shah, Dhruv and Bhorkar, Arjun and Leen, Hrishit and Kostrikov, Ilya and Rhinehart, Nicholas and Levine, Sergey},
  booktitle = {Proceedings of The 6th Conference on Robot Learning},
  pages     = {44--54},
  year      = {2023},
  editor    = {Liu, Karen and Kulic, Dana and Ichnowski, Jeff},
  volume    = {205},
  series    = {Proceedings of Machine Learning Research},
  month     = {14--18 Dec},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v205/shah23a/shah23a.pdf},
  url       = {https://proceedings.mlr.press/v205/shah23a.html},
  abstract  = {Reinforcement learning can enable robots to navigate to distant goals while optimizing user-specified reward functions, including preferences for following lanes, staying on paved paths, or avoiding freshly mowed grass. However, online learning from trial-and-error for real-world robots is logistically challenging, and methods that instead can utilize existing datasets of robotic navigation data could be significantly more scalable and enable broader generalization. In this paper, we present ReViND, the first offline RL system for robotic navigation that can leverage previously collected data to optimize user-specified reward functions in the real world. We evaluate our system for off-road navigation without any additional data collection or fine-tuning, and show that it can navigate to distant goals using only offline training from this dataset, and exhibit behaviors that qualitatively differ based on the user-specified reward function.}
}
Endnote
%0 Conference Paper
%T Offline Reinforcement Learning for Visual Navigation
%A Dhruv Shah
%A Arjun Bhorkar
%A Hrishit Leen
%A Ilya Kostrikov
%A Nicholas Rhinehart
%A Sergey Levine
%B Proceedings of The 6th Conference on Robot Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Karen Liu
%E Dana Kulic
%E Jeff Ichnowski
%F pmlr-v205-shah23a
%I PMLR
%P 44--54
%U https://proceedings.mlr.press/v205/shah23a.html
%V 205
%X Reinforcement learning can enable robots to navigate to distant goals while optimizing user-specified reward functions, including preferences for following lanes, staying on paved paths, or avoiding freshly mowed grass. However, online learning from trial-and-error for real-world robots is logistically challenging, and methods that instead can utilize existing datasets of robotic navigation data could be significantly more scalable and enable broader generalization. In this paper, we present ReViND, the first offline RL system for robotic navigation that can leverage previously collected data to optimize user-specified reward functions in the real world. We evaluate our system for off-road navigation without any additional data collection or fine-tuning, and show that it can navigate to distant goals using only offline training from this dataset, and exhibit behaviors that qualitatively differ based on the user-specified reward function.
APA
Shah, D., Bhorkar, A., Leen, H., Kostrikov, I., Rhinehart, N. & Levine, S. (2023). Offline Reinforcement Learning for Visual Navigation. Proceedings of The 6th Conference on Robot Learning, in Proceedings of Machine Learning Research 205:44-54. Available from https://proceedings.mlr.press/v205/shah23a.html.