Action-Sufficient State Representation Learning for Control with Structural Constraints

Biwei Huang, Chaochao Lu, Liu Leqi, Jose Miguel Hernandez-Lobato, Clark Glymour, Bernhard Schölkopf, Kun Zhang
Proceedings of the 39th International Conference on Machine Learning, PMLR 162:9260-9279, 2022.

Abstract

Perceived signals in real-world scenarios are usually high-dimensional and noisy, and finding and using their representation that contains essential and sufficient information required by downstream decision-making tasks will help improve computational efficiency and generalization ability in the tasks. In this paper, we focus on partially observable environments and propose to learn a minimal set of state representations that capture sufficient information for decision-making, termed Action-Sufficient state Representations (ASRs). We build a generative environment model for the structural relationships among variables in the system and present a principled way to characterize ASRs based on structural constraints and the goal of maximizing cumulative reward in policy learning. We then develop a structured sequential Variational Auto-Encoder to estimate the environment model and extract ASRs. Our empirical results on CarRacing and VizDoom demonstrate a clear advantage of learning and using ASRs for policy learning. Moreover, the estimated environment model and ASRs allow learning behaviors from imagined outcomes in the compact latent space to improve sample efficiency.

Cite this Paper


BibTeX
@InProceedings{pmlr-v162-huang22f,
  title     = {Action-Sufficient State Representation Learning for Control with Structural Constraints},
  author    = {Huang, Biwei and Lu, Chaochao and Leqi, Liu and Hernandez-Lobato, Jose Miguel and Glymour, Clark and Sch{\"o}lkopf, Bernhard and Zhang, Kun},
  booktitle = {Proceedings of the 39th International Conference on Machine Learning},
  pages     = {9260--9279},
  year      = {2022},
  editor    = {Chaudhuri, Kamalika and Jegelka, Stefanie and Song, Le and Szepesvari, Csaba and Niu, Gang and Sabato, Sivan},
  volume    = {162},
  series    = {Proceedings of Machine Learning Research},
  month     = {17--23 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v162/huang22f/huang22f.pdf},
  url       = {https://proceedings.mlr.press/v162/huang22f.html},
  abstract  = {Perceived signals in real-world scenarios are usually high-dimensional and noisy, and finding and using their representation that contains essential and sufficient information required by downstream decision-making tasks will help improve computational efficiency and generalization ability in the tasks. In this paper, we focus on partially observable environments and propose to learn a minimal set of state representations that capture sufficient information for decision-making, termed Action-Sufficient state Representations (ASRs). We build a generative environment model for the structural relationships among variables in the system and present a principled way to characterize ASRs based on structural constraints and the goal of maximizing cumulative reward in policy learning. We then develop a structured sequential Variational Auto-Encoder to estimate the environment model and extract ASRs. Our empirical results on CarRacing and VizDoom demonstrate a clear advantage of learning and using ASRs for policy learning. Moreover, the estimated environment model and ASRs allow learning behaviors from imagined outcomes in the compact latent space to improve sample efficiency.}
}
Endnote
%0 Conference Paper
%T Action-Sufficient State Representation Learning for Control with Structural Constraints
%A Biwei Huang
%A Chaochao Lu
%A Liu Leqi
%A Jose Miguel Hernandez-Lobato
%A Clark Glymour
%A Bernhard Schölkopf
%A Kun Zhang
%B Proceedings of the 39th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2022
%E Kamalika Chaudhuri
%E Stefanie Jegelka
%E Le Song
%E Csaba Szepesvari
%E Gang Niu
%E Sivan Sabato
%F pmlr-v162-huang22f
%I PMLR
%P 9260--9279
%U https://proceedings.mlr.press/v162/huang22f.html
%V 162
%X Perceived signals in real-world scenarios are usually high-dimensional and noisy, and finding and using their representation that contains essential and sufficient information required by downstream decision-making tasks will help improve computational efficiency and generalization ability in the tasks. In this paper, we focus on partially observable environments and propose to learn a minimal set of state representations that capture sufficient information for decision-making, termed Action-Sufficient state Representations (ASRs). We build a generative environment model for the structural relationships among variables in the system and present a principled way to characterize ASRs based on structural constraints and the goal of maximizing cumulative reward in policy learning. We then develop a structured sequential Variational Auto-Encoder to estimate the environment model and extract ASRs. Our empirical results on CarRacing and VizDoom demonstrate a clear advantage of learning and using ASRs for policy learning. Moreover, the estimated environment model and ASRs allow learning behaviors from imagined outcomes in the compact latent space to improve sample efficiency.
APA
Huang, B., Lu, C., Leqi, L., Hernandez-Lobato, J. M., Glymour, C., Schölkopf, B., & Zhang, K. (2022). Action-Sufficient State Representation Learning for Control with Structural Constraints. Proceedings of the 39th International Conference on Machine Learning, in Proceedings of Machine Learning Research 162:9260-9279. Available from https://proceedings.mlr.press/v162/huang22f.html.