Constrained Contrastive Reinforcement Learning
Proceedings of The 14th Asian Conference on Machine
Learning, PMLR 189:1070-1084, 2023.
Abstract
Learning to control from complex observations
remains a major challenge in the application of
model-based reinforcement learning (MBRL). Existing
MBRL methods apply contrastive learning to replace
pixel-level reconstruction, improving the
performance of the latent world model. However,
previous contrastive learning approaches in MBRL
fail to utilize task-relevant information, making it
difficult to aggregate observations that share the same
task-relevant information but differ in
task-irrelevant information in latent space. In this
work, we first propose Constrained Contrastive
Reinforcement Learning (C2RL), an MBRL method that
learns a world model through a combination of two
contrastive losses based on latent dynamics and
task-relevant state abstraction respectively,
utilizing reward information to accelerate model
learning. Then, we propose a hyperparameter $\beta$
to balance the two contrastive losses and thereby
strengthen the representation ability of the latent
dynamics. The experimental results show that our
approach outperforms state-of-the-art methods in
both the natural-video and standard-background
settings on challenging DMControl tasks.
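The β-weighted combination described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the InfoNCE form of the contrastive loss, the function names, and the choice of positives (dynamics-predicted latents for the dynamics term, reward-matched latents for the abstraction term) are all assumptions for the sake of the example.

```python
import numpy as np

def info_nce(anchors, positives, temperature=0.1):
    """Minimal InfoNCE contrastive loss: row i of `positives` is the
    positive for row i of `anchors`; all other rows act as negatives."""
    # Cosine-similarity logits between every anchor/candidate pair.
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / temperature
    # Cross-entropy with the matching pair (the diagonal) as the target.
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))

def combined_objective(z, z_dyn_pred, z_same_reward, beta=0.5):
    """Hypothetical beta-weighted sum of the two contrastive terms:
    a latent-dynamics loss and a task-relevant abstraction loss."""
    loss_dynamics = info_nce(z_dyn_pred, z)        # predicted vs. true next latent
    loss_abstraction = info_nce(z, z_same_reward)  # latents sharing reward info
    return loss_dynamics + beta * loss_abstraction
```

With β = 0 the objective reduces to the pure latent-dynamics loss; larger β pushes latents with the same task-relevant (reward) information closer together.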