Variational Empowerment as Representation Learning for Goal-Conditioned Reinforcement Learning

Jongwook Choi, Archit Sharma, Honglak Lee, Sergey Levine, Shixiang Shane Gu
Proceedings of the 38th International Conference on Machine Learning, PMLR 139:1953-1963, 2021.

Abstract

Learning to reach goal states and learning diverse skills through mutual information maximization have been proposed as principled frameworks for unsupervised reinforcement learning, allowing agents to acquire broadly applicable multi-task policies with minimal reward engineering. In this paper, we discuss how these two approaches, goal-conditioned RL (GCRL) and MI-based RL, can be generalized into a single family of methods, interpreting mutual information maximization and variational empowerment as representation learning methods that acquire functionally aware state representations for goal reaching. Starting from a simple observation that standard GCRL is encapsulated by the optimization objective of variational empowerment, we can derive novel variants of GCRL and variational empowerment under a single, unified optimization objective, such as adaptive-variance GCRL and linear-mapping GCRL, and study the characteristics of the representation learning each variant provides. Furthermore, through the lens of GCRL, we show that adapting powerful techniques from GCRL, such as goal relabeling, into the variational MI context, as well as proper regularization on the variational posterior, provides substantial gains in algorithm performance, and we propose a novel evaluation metric named latent goal reaching (LGR) as an objective measure for evaluating empowerment algorithms akin to goal-based RL. Through principled mathematical derivations and careful experimental validations, our work lays a novel foundation from which representation learning can be evaluated and analyzed in goal-based RL.
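To make the stated connection concrete, the LaTeX sketch below writes out the standard variational lower bound on the mutual information between states and latent goals, and shows how one particular choice of variational posterior recovers the usual GCRL distance-based reward. The specific symbols (q_\phi, \sigma, A, b) are illustrative notation chosen here, not a verbatim reproduction of the paper's equations.

% Variational lower bound on mutual information (Barber-Agakov style),
% with z a latent goal/skill sampled from a prior p(z), s a state visited
% by the policy \pi conditioned on z, and q_\phi a learned variational posterior:
I(S; Z) \;\ge\; \mathbb{E}_{z \sim p(z),\; s \sim \pi(\cdot \mid z)}\!\big[ \log q_\phi(z \mid s) \big] \;+\; \mathcal{H}(Z)

% Taking the latent to be a goal state, z = g, and choosing a fixed-variance
% Gaussian posterior centered at the current state recovers the standard
% GCRL squared-distance reward up to constants:
q_\phi(g \mid s) \;=\; \mathcal{N}\!\big(g;\, s,\, \sigma^2 I\big)
\quad\Rightarrow\quad
\log q_\phi(g \mid s) \;=\; -\frac{\|s - g\|^2}{2\sigma^2} \;+\; \text{const}

% Under this reading, the variants named in the abstract correspond to richer
% posterior families, e.g. a state-dependent variance \sigma_\phi(s)
% (adaptive-variance GCRL) or a learned linear map of the state,
% q_\phi(g \mid s) = \mathcal{N}(g;\, A s + b,\, \sigma^2 I) (linear-mapping GCRL).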

Cite this Paper


BibTeX
@InProceedings{pmlr-v139-choi21b,
  title     = {Variational Empowerment as Representation Learning for Goal-Conditioned Reinforcement Learning},
  author    = {Choi, Jongwook and Sharma, Archit and Lee, Honglak and Levine, Sergey and Gu, Shixiang Shane},
  booktitle = {Proceedings of the 38th International Conference on Machine Learning},
  pages     = {1953--1963},
  year      = {2021},
  editor    = {Meila, Marina and Zhang, Tong},
  volume    = {139},
  series    = {Proceedings of Machine Learning Research},
  month     = {18--24 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v139/choi21b/choi21b.pdf},
  url       = {https://proceedings.mlr.press/v139/choi21b.html}
}
Endnote
%0 Conference Paper
%T Variational Empowerment as Representation Learning for Goal-Conditioned Reinforcement Learning
%A Jongwook Choi
%A Archit Sharma
%A Honglak Lee
%A Sergey Levine
%A Shixiang Shane Gu
%B Proceedings of the 38th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2021
%E Marina Meila
%E Tong Zhang
%F pmlr-v139-choi21b
%I PMLR
%P 1953--1963
%U https://proceedings.mlr.press/v139/choi21b.html
%V 139
APA
Choi, J., Sharma, A., Lee, H., Levine, S. &amp; Gu, S. S. (2021). Variational Empowerment as Representation Learning for Goal-Conditioned Reinforcement Learning. Proceedings of the 38th International Conference on Machine Learning, in Proceedings of Machine Learning Research 139:1953-1963. Available from https://proceedings.mlr.press/v139/choi21b.html.
