Principled Offline RL in the Presence of Rich Exogenous Information

Riashat Islam; Manan Tomar; Alex Lamb; Yonathan Efroni; Hongyu Zang; Aniket Rajiv Didolkar; Dipendra Misra; Xin Li; Harm Van Seijen; Remi Tachet Des Combes; John Langford

Proceedings of the 40th International Conference on Machine Learning, PMLR 202:14390-14421, 2023.

Abstract

Learning to control an agent from offline data collected in a rich pixel-based visual observation space is vital for real-world applications of reinforcement learning (RL). A major challenge in this setting is the presence of input information that is hard to model and irrelevant to controlling the agent. This problem has been approached by the theoretical RL community through the lens of exogenous information, i.e., any control-irrelevant information contained in observations. For example, a robot navigating in busy streets needs to ignore irrelevant information, such as other people walking in the background, textures of objects, or birds in the sky. In this paper, we focus on the setting with visually detailed exogenous information and introduce new offline RL benchmarks that offer the ability to study this problem. We find that contemporary representation learning techniques can fail on datasets where the noise is a complex and time-dependent process, which is prevalent in practical applications. To address these, we propose to use multi-step inverse models to learn Agent-Centric Representations for Offline-RL (ACRO). Despite being simple and reward-free, we show theoretically and empirically that the representation created by this objective greatly outperforms baselines.

Cite this Paper

BibTeX


@InProceedings{pmlr-v202-islam23a,
  title = 	 {Principled Offline {RL} in the Presence of Rich Exogenous Information},
  author =       {Islam, Riashat and Tomar, Manan and Lamb, Alex and Efroni, Yonathan and Zang, Hongyu and Didolkar, Aniket Rajiv and Misra, Dipendra and Li, Xin and Van Seijen, Harm and Tachet Des Combes, Remi and Langford, John},
  booktitle = 	 {Proceedings of the 40th International Conference on Machine Learning},
  pages = 	 {14390--14421},
  year = 	 {2023},
  editor = 	 {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan},
  volume = 	 {202},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {23--29 Jul},
  publisher =    {PMLR},
  pdf = 	 {https://proceedings.mlr.press/v202/islam23a/islam23a.pdf},
  url = 	 {https://proceedings.mlr.press/v202/islam23a.html},
  abstract = 	 {Learning to control an agent from offline data collected in a rich pixel-based visual observation space is vital for real-world applications of reinforcement learning (RL). A major challenge in this setting is the presence of input information that is hard to model and irrelevant to controlling the agent. This problem has been approached by the theoretical RL community through the lens of exogenous information, i.e., any control-irrelevant information contained in observations. For example, a robot navigating in busy streets needs to ignore irrelevant information, such as other people walking in the background, textures of objects, or birds in the sky. In this paper, we focus on the setting with visually detailed exogenous information and introduce new offline RL benchmarks that offer the ability to study this problem. We find that contemporary representation learning techniques can fail on datasets where the noise is a complex and time-dependent process, which is prevalent in practical applications. To address these, we propose to use multi-step inverse models to learn Agent-Centric Representations for Offline-RL (ACRO). Despite being simple and reward-free, we show theoretically and empirically that the representation created by this objective greatly outperforms baselines.}
}

Endnote

%0 Conference Paper
%T Principled Offline RL in the Presence of Rich Exogenous Information
%A Riashat Islam
%A Manan Tomar
%A Alex Lamb
%A Yonathan Efroni
%A Hongyu Zang
%A Aniket Rajiv Didolkar
%A Dipendra Misra
%A Xin Li
%A Harm Van Seijen
%A Remi Tachet Des Combes
%A John Langford
%B Proceedings of the 40th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Andreas Krause
%E Emma Brunskill
%E Kyunghyun Cho
%E Barbara Engelhardt
%E Sivan Sabato
%E Jonathan Scarlett
%F pmlr-v202-islam23a
%I PMLR
%P 14390--14421
%U https://proceedings.mlr.press/v202/islam23a.html
%V 202
%X Learning to control an agent from offline data collected in a rich pixel-based visual observation space is vital for real-world applications of reinforcement learning (RL). A major challenge in this setting is the presence of input information that is hard to model and irrelevant to controlling the agent. This problem has been approached by the theoretical RL community through the lens of exogenous information, i.e., any control-irrelevant information contained in observations. For example, a robot navigating in busy streets needs to ignore irrelevant information, such as other people walking in the background, textures of objects, or birds in the sky. In this paper, we focus on the setting with visually detailed exogenous information and introduce new offline RL benchmarks that offer the ability to study this problem. We find that contemporary representation learning techniques can fail on datasets where the noise is a complex and time-dependent process, which is prevalent in practical applications. To address these, we propose to use multi-step inverse models to learn Agent-Centric Representations for Offline-RL (ACRO). Despite being simple and reward-free, we show theoretically and empirically that the representation created by this objective greatly outperforms baselines.

APA


Islam, R., Tomar, M., Lamb, A., Efroni, Y., Zang, H., Didolkar, A.R., Misra, D., Li, X., Van Seijen, H., Tachet Des Combes, R. & Langford, J. (2023). Principled Offline RL in the Presence of Rich Exogenous Information. Proceedings of the 40th International Conference on Machine Learning, in Proceedings of Machine Learning Research 202:14390-14421. Available from https://proceedings.mlr.press/v202/islam23a.html.
