Constrained Contrastive Reinforcement Learning

Haoyu Wang; Xinrui Yang; Yuhang Wang; Lan Xuguang

Proceedings of The 14th Asian Conference on Machine Learning, PMLR 189:1070-1084, 2023.

Abstract

Learning to control from complex observations remains a major challenge in the application of model-based reinforcement learning (MBRL). Existing MBRL methods apply contrastive learning to replace pixel-level reconstruction, improving the performance of the latent world model. However, previous contrastive learning approaches in MBRL fail to utilize task-relevant information, making it difficult to aggregate observations with the same task-relevant information but different task-irrelevant information in latent space. In this work, we first propose Constrained Contrastive Reinforcement Learning (C2RL), an MBRL method that learns a world model through a combination of two contrastive losses based on latent dynamics and task-relevant state abstraction respectively, utilizing reward information to accelerate model learning. Then, we propose a hyperparameter $\beta$ to balance the two kinds of contrastive losses to strengthen the representation ability of the latent dynamics. The experimental results show that our approach outperforms state-of-the-art methods in both the natural video and standard background settings on challenging DMControl tasks.

Cite this Paper

BibTeX


@InProceedings{pmlr-v189-wang23a,
  title = 	 {Constrained Contrastive Reinforcement Learning},
  author =       {Wang, Haoyu and Yang, Xinrui and Wang, Yuhang and Xuguang, Lan},
  booktitle = 	 {Proceedings of The 14th Asian Conference on Machine Learning},
  pages = 	 {1070--1084},
  year = 	 {2023},
  editor = 	 {Khan, Emtiyaz and Gonen, Mehmet},
  volume = 	 {189},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {12--14 Dec},
  publisher =    {PMLR},
  pdf = 	 {https://proceedings.mlr.press/v189/wang23a/wang23a.pdf},
  url = 	 {https://proceedings.mlr.press/v189/wang23a.html},
  abstract = 	 {Learning to control from complex observations
 remains a major challenge in the application of
 model-based reinforcement learning (MBRL). Existing
 MBRL methods apply contrastive learning to replace
 pixel-level reconstruction, improving the
 performance of the latent world model. However,
 previous contrastive learning approaches in MBRL
 fail to utilize task-relevant information, making it
 difficult to aggregate observations with the same
 task-relevant information but different
 task-irrelevant information in latent space. In this
 work, we first propose Constrained Contrastive
 Reinforcement Learning (C2RL), an MBRL method that
 learns a world model through a combination of two
 contrastive losses based on latent dynamics and
 task-relevant state abstraction respectively,
 utilizing reward information to accelerate model
 learning. Then, we propose a hyperparameter $\beta$
 to balance the two kinds of contrastive losses to
 strengthen the representation ability of the latent
 dynamics. The experimental results show that our
 approach outperforms state-of-the-art methods in
 both the natural video and standard background
 settings on challenging DMControl tasks.}
}

Endnote

%0 Conference Paper
%T Constrained Contrastive Reinforcement Learning
%A Haoyu Wang
%A Xinrui Yang
%A Yuhang Wang
%A Lan Xuguang
%B Proceedings of The 14th Asian Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Emtiyaz Khan
%E Mehmet Gonen
%F pmlr-v189-wang23a
%I PMLR
%P 1070--1084
%U https://proceedings.mlr.press/v189/wang23a.html
%V 189
%X Learning to control from complex observations
 remains a major challenge in the application of
 model-based reinforcement learning (MBRL). Existing
 MBRL methods apply contrastive learning to replace
 pixel-level reconstruction, improving the
 performance of the latent world model. However,
 previous contrastive learning approaches in MBRL
 fail to utilize task-relevant information, making it
 difficult to aggregate observations with the same
 task-relevant information but different
 task-irrelevant information in latent space. In this
 work, we first propose Constrained Contrastive
 Reinforcement Learning (C2RL), an MBRL method that
 learns a world model through a combination of two
 contrastive losses based on latent dynamics and
 task-relevant state abstraction respectively,
 utilizing reward information to accelerate model
 learning. Then, we propose a hyperparameter $\beta$
 to balance the two kinds of contrastive losses to
 strengthen the representation ability of the latent
 dynamics. The experimental results show that our
 approach outperforms state-of-the-art methods in
 both the natural video and standard background
 settings on challenging DMControl tasks.

APA


Wang, H., Yang, X., Wang, Y. & Xuguang, L. (2023). Constrained Contrastive Reinforcement Learning. Proceedings of The 14th Asian Conference on Machine Learning, in Proceedings of Machine Learning Research 189:1070-1084. Available from https://proceedings.mlr.press/v189/wang23a.html.
