Video-Enhanced Offline Reinforcement Learning: A Model-Based Approach

Minting Pan; Yitao Zheng; Jiajian Li; Yunbo Wang; Xiaokang Yang

Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:47738-47754, 2025.

Abstract

Offline reinforcement learning (RL) enables policy optimization using static datasets, avoiding the risks and costs of extensive real-world exploration. However, it struggles with suboptimal offline behaviors and inaccurate value estimation due to the lack of environmental interaction. We present Video-Enhanced Offline RL (VeoRL), a model-based method that constructs an interactive world model from diverse, unlabeled video data readily available online. Leveraging model-based behavior guidance, our approach transfers commonsense knowledge of control policy and physical dynamics from natural videos to the RL agent within the target domain. VeoRL achieves substantial performance gains (over 100% in some cases) across visual control tasks in robotic manipulation, autonomous driving, and open-world video games. Project page: https://panmt.github.io/VeoRL.github.io.

Cite this Paper

BibTeX

@InProceedings{pmlr-v267-pan25h,
  title     = {Video-Enhanced Offline Reinforcement Learning: A Model-Based Approach},
  author    = {Pan, Minting and Zheng, Yitao and Li, Jiajian and Wang, Yunbo and Yang, Xiaokang},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages     = {47738--47754},
  year      = {2025},
  editor    = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume    = {267},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--19 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/pan25h/pan25h.pdf},
  url       = {https://proceedings.mlr.press/v267/pan25h.html},
  abstract  = {Offline reinforcement learning (RL) enables policy optimization using static datasets, avoiding the risks and costs of extensive real-world exploration. However, it struggles with suboptimal offline behaviors and inaccurate value estimation due to the lack of environmental interaction. We present Video-Enhanced Offline RL (VeoRL), a model-based method that constructs an interactive world model from diverse, unlabeled video data readily available online. Leveraging model-based behavior guidance, our approach transfers commonsense knowledge of control policy and physical dynamics from natural videos to the RL agent within the target domain. VeoRL achieves substantial performance gains (over 100% in some cases) across visual control tasks in robotic manipulation, autonomous driving, and open-world video games. Project page: https://panmt.github.io/VeoRL.github.io.}
}

Endnote

%0 Conference Paper
%T Video-Enhanced Offline Reinforcement Learning: A Model-Based Approach
%A Minting Pan
%A Yitao Zheng
%A Jiajian Li
%A Yunbo Wang
%A Xiaokang Yang
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-pan25h
%I PMLR
%P 47738--47754
%U https://proceedings.mlr.press/v267/pan25h.html
%V 267
%X Offline reinforcement learning (RL) enables policy optimization using static datasets, avoiding the risks and costs of extensive real-world exploration. However, it struggles with suboptimal offline behaviors and inaccurate value estimation due to the lack of environmental interaction. We present Video-Enhanced Offline RL (VeoRL), a model-based method that constructs an interactive world model from diverse, unlabeled video data readily available online. Leveraging model-based behavior guidance, our approach transfers commonsense knowledge of control policy and physical dynamics from natural videos to the RL agent within the target domain. VeoRL achieves substantial performance gains (over 100% in some cases) across visual control tasks in robotic manipulation, autonomous driving, and open-world video games. Project page: https://panmt.github.io/VeoRL.github.io.

APA

Pan, M., Zheng, Y., Li, J., Wang, Y. & Yang, X. (2025). Video-Enhanced Offline Reinforcement Learning: A Model-Based Approach. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:47738-47754. Available from https://proceedings.mlr.press/v267/pan25h.html.
