Contrastive Variational Reinforcement Learning for Complex Observations

Xiao Ma; SIWEI CHEN; David Hsu; Wee Sun Lee

Proceedings of the 2020 Conference on Robot Learning, PMLR 155:959-972, 2021.

Abstract

Deep reinforcement learning (DRL) has achieved significant success in various robot tasks: manipulation, navigation, etc. However, complex visual observations in natural environments remain a major challenge. This paper presents Contrastive Variational Reinforcement Learning (CVRL), a model-based method that tackles complex visual observations in DRL. CVRL learns a contrastive variational model by maximizing the mutual information between latent states and observations discriminatively, through contrastive learning. It avoids modeling the complex observation space unnecessarily, as the commonly used generative observation model often does, and is significantly more robust. CVRL achieves performance comparable to state-of-the-art model-based DRL methods on standard MuJoCo tasks. It significantly outperforms them on Natural MuJoCo tasks and a robot box-pushing task with complex observations, e.g., dynamic shadows. The CVRL code is available publicly at https://github.com/Yusufma03/CVRL.

Cite this Paper

BibTeX


@InProceedings{pmlr-v155-ma21a,
  title = 	 {Contrastive Variational Reinforcement Learning for Complex Observations},
  author =       {Ma, Xiao and CHEN, SIWEI and Hsu, David and Lee, Wee Sun},
  booktitle = 	 {Proceedings of the 2020 Conference on Robot Learning},
  pages = 	 {959--972},
  year = 	 {2021},
  editor = 	 {Kober, Jens and Ramos, Fabio and Tomlin, Claire},
  volume = 	 {155},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {16--18 Nov},
  publisher =    {PMLR},
  pdf = 	 {https://proceedings.mlr.press/v155/ma21a/ma21a.pdf},
  url = 	 {https://proceedings.mlr.press/v155/ma21a.html},
  abstract = 	 {Deep reinforcement learning (DRL) has achieved significant success in various robot tasks: manipulation, navigation, etc. However, complex visual observations in natural environments remain a major challenge. This paper presents Contrastive Variational Reinforcement Learning (CVRL), a model-based method that tackles complex visual observations in DRL. CVRL learns a contrastive variational model by maximizing the mutual information between latent states and observations discriminatively, through contrastive learning. It avoids modeling the complex observation space unnecessarily, as the commonly used generative observation model often does, and is significantly more robust. CVRL achieves performance comparable to state-of-the-art model-based DRL methods on standard MuJoCo tasks. It significantly outperforms them on Natural MuJoCo tasks and a robot box-pushing task with complex observations, e.g., dynamic shadows. The CVRL code is available publicly at https://github.com/Yusufma03/CVRL.}
}

Endnote

%0 Conference Paper
%T Contrastive Variational Reinforcement Learning for Complex Observations
%A Xiao Ma
%A SIWEI CHEN
%A David Hsu
%A Wee Sun Lee
%B Proceedings of the 2020 Conference on Robot Learning
%C Proceedings of Machine Learning Research
%D 2021
%E Jens Kober
%E Fabio Ramos
%E Claire Tomlin
%F pmlr-v155-ma21a
%I PMLR
%P 959--972
%U https://proceedings.mlr.press/v155/ma21a.html
%V 155
%X Deep reinforcement learning (DRL) has achieved significant success in various robot tasks: manipulation, navigation, etc. However, complex visual observations in natural environments remain a major challenge. This paper presents Contrastive Variational Reinforcement Learning (CVRL), a model-based method that tackles complex visual observations in DRL. CVRL learns a contrastive variational model by maximizing the mutual information between latent states and observations discriminatively, through contrastive learning. It avoids modeling the complex observation space unnecessarily, as the commonly used generative observation model often does, and is significantly more robust. CVRL achieves performance comparable to state-of-the-art model-based DRL methods on standard MuJoCo tasks. It significantly outperforms them on Natural MuJoCo tasks and a robot box-pushing task with complex observations, e.g., dynamic shadows. The CVRL code is available publicly at https://github.com/Yusufma03/CVRL.

APA


Ma, X., CHEN, S., Hsu, D. & Lee, W.S. (2021). Contrastive Variational Reinforcement Learning for Complex Observations. Proceedings of the 2020 Conference on Robot Learning, in Proceedings of Machine Learning Research 155:959-972. Available from https://proceedings.mlr.press/v155/ma21a.html.
