Reinforcement Learning with Videos: Combining Offline Observations with Interaction

Karl Schmeckpeper, Oleh Rybkin, Kostas Daniilidis, Sergey Levine, Chelsea Finn
Proceedings of the 2020 Conference on Robot Learning, PMLR 155:339-354, 2021.

Abstract

Reinforcement learning is a powerful framework for robots to acquire skills from experience, but often requires a substantial amount of online data collection. As a result, it is difficult to collect sufficiently diverse experiences that are needed for robots to generalize broadly. Videos of humans, on the other hand, are a readily available source of broad and interesting experiences. In this paper, we consider the question: can we perform reinforcement learning directly on experience collected by humans? This problem is particularly difficult, as such videos are not annotated with actions and exhibit substantial visual domain shift relative to the robot’s embodiment. To address these challenges, we propose a framework for reinforcement learning with videos (RLV). RLV learns a policy and value function using experience collected by humans in combination with data collected by robots. In our experiments, we find that RLV is able to leverage such videos to learn challenging vision-based skills with less than half as many samples as RL methods that learn from scratch.
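The abstract describes RLV only at a high level, so the following is a minimal, hypothetical sketch of the data-combination idea: observation-only (human video) transitions are folded into an off-policy replay buffer alongside robot transitions, with the missing actions filled in by an inverse dynamics model and the missing rewards by a simple success heuristic. All names here (inverse_model, label_video_batch, sample_mixed_batch, the buffer layout, and the toy dimensions) are illustrative assumptions, not the authors' implementation, and the paper's handling of visual domain shift is not shown.

# Illustrative sketch (not the authors' code): mixing action-free video
# transitions with robot interaction data in one off-policy training batch.
# The inverse model, reward heuristic, and dimensions are hypothetical stand-ins.

import numpy as np

rng = np.random.default_rng(0)
OBS_DIM, ACT_DIM = 8, 2  # toy dimensions, assumed for illustration


def inverse_model(obs, next_obs):
    """Hypothetical inverse dynamics model: predicts the action that maps
    obs -> next_obs. A real system would train this on robot data; here a
    random stand-in keeps the sketch runnable end to end."""
    return rng.normal(size=(obs.shape[0], ACT_DIM))


def label_video_batch(video_obs, video_next_obs, success_flags):
    """Turn observation-only (human video) transitions into RL transitions:
    actions come from the inverse model, rewards from a simple heuristic
    (1 on task success, 0 otherwise)."""
    actions = inverse_model(video_obs, video_next_obs)
    rewards = success_flags.astype(np.float32)
    return video_obs, actions, rewards, video_next_obs


def sample_mixed_batch(robot_buffer, video_buffer, batch_size=32, video_frac=0.5):
    """Sample a training batch that mixes robot transitions with
    action/reward-labeled video transitions."""
    n_video = int(batch_size * video_frac)
    n_robot = batch_size - n_video

    r_idx = rng.integers(len(robot_buffer["obs"]), size=n_robot)
    v_idx = rng.integers(len(video_buffer["obs"]), size=n_video)

    v_obs, v_act, v_rew, v_next = label_video_batch(
        video_buffer["obs"][v_idx],
        video_buffer["next_obs"][v_idx],
        video_buffer["success"][v_idx],
    )

    return {
        "obs": np.concatenate([robot_buffer["obs"][r_idx], v_obs]),
        "act": np.concatenate([robot_buffer["act"][r_idx], v_act]),
        "rew": np.concatenate([robot_buffer["rew"][r_idx], v_rew]),
        "next_obs": np.concatenate([robot_buffer["next_obs"][r_idx], v_next]),
    }


if __name__ == "__main__":
    # Toy buffers standing in for robot interaction data and human videos.
    robot_buffer = {
        "obs": rng.normal(size=(100, OBS_DIM)),
        "act": rng.normal(size=(100, ACT_DIM)),
        "rew": rng.normal(size=(100,)),
        "next_obs": rng.normal(size=(100, OBS_DIM)),
    }
    video_buffer = {
        "obs": rng.normal(size=(200, OBS_DIM)),
        "next_obs": rng.normal(size=(200, OBS_DIM)),
        "success": rng.integers(0, 2, size=(200,)),
    }
    batch = sample_mixed_batch(robot_buffer, video_buffer)
    print({k: v.shape for k, v in batch.items()})

The mixed batch would then feed a standard off-policy actor-critic update; the point of the sketch is only that video data, once labeled, can be consumed by the same update as robot data.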

Cite this Paper


BibTeX
@InProceedings{pmlr-v155-schmeckpeper21a,
  title     = {Reinforcement Learning with Videos: Combining Offline Observations with Interaction},
  author    = {Schmeckpeper, Karl and Rybkin, Oleh and Daniilidis, Kostas and Levine, Sergey and Finn, Chelsea},
  booktitle = {Proceedings of the 2020 Conference on Robot Learning},
  pages     = {339--354},
  year      = {2021},
  editor    = {Kober, Jens and Ramos, Fabio and Tomlin, Claire},
  volume    = {155},
  series    = {Proceedings of Machine Learning Research},
  month     = {16--18 Nov},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v155/schmeckpeper21a/schmeckpeper21a.pdf},
  url       = {https://proceedings.mlr.press/v155/schmeckpeper21a.html},
  abstract  = {Reinforcement learning is a powerful framework for robots to acquire skills from experience, but often requires a substantial amount of online data collection. As a result, it is difficult to collect sufficiently diverse experiences that are needed for robots to generalize broadly. Videos of humans, on the other hand, are a readily available source of broad and interesting experiences. In this paper, we consider the question: can we perform reinforcement learning directly on experience collected by humans? This problem is particularly difficult, as such videos are not annotated with actions and exhibit substantial visual domain shift relative to the robot’s embodiment. To address these challenges, we propose a framework for reinforcement learning with videos (RLV). RLV learns a policy and value function using experience collected by humans in combination with data collected by robots. In our experiments, we find that RLV is able to leverage such videos to learn challenging vision-based skills with less than half as many samples as RL methods that learn from scratch.}
}
Endnote
%0 Conference Paper
%T Reinforcement Learning with Videos: Combining Offline Observations with Interaction
%A Karl Schmeckpeper
%A Oleh Rybkin
%A Kostas Daniilidis
%A Sergey Levine
%A Chelsea Finn
%B Proceedings of the 2020 Conference on Robot Learning
%C Proceedings of Machine Learning Research
%D 2021
%E Jens Kober
%E Fabio Ramos
%E Claire Tomlin
%F pmlr-v155-schmeckpeper21a
%I PMLR
%P 339--354
%U https://proceedings.mlr.press/v155/schmeckpeper21a.html
%V 155
%X Reinforcement learning is a powerful framework for robots to acquire skills from experience, but often requires a substantial amount of online data collection. As a result, it is difficult to collect sufficiently diverse experiences that are needed for robots to generalize broadly. Videos of humans, on the other hand, are a readily available source of broad and interesting experiences. In this paper, we consider the question: can we perform reinforcement learning directly on experience collected by humans? This problem is particularly difficult, as such videos are not annotated with actions and exhibit substantial visual domain shift relative to the robot’s embodiment. To address these challenges, we propose a framework for reinforcement learning with videos (RLV). RLV learns a policy and value function using experience collected by humans in combination with data collected by robots. In our experiments, we find that RLV is able to leverage such videos to learn challenging vision-based skills with less than half as many samples as RL methods that learn from scratch.
APA
Schmeckpeper, K., Rybkin, O., Daniilidis, K., Levine, S., & Finn, C. (2021). Reinforcement Learning with Videos: Combining Offline Observations with Interaction. Proceedings of the 2020 Conference on Robot Learning, in Proceedings of Machine Learning Research 155:339-354. Available from https://proceedings.mlr.press/v155/schmeckpeper21a.html.
