Model-based Meta Reinforcement Learning using Graph Structured Surrogate Models and Amortized Policy Search

Qi Wang; Herke Van Hoof

Proceedings of the 39th International Conference on Machine Learning, PMLR 162:23055-23077, 2022.

Abstract

Reinforcement learning is a promising paradigm for solving sequential decision-making problems, but low data efficiency and weak generalization across tasks are bottlenecks in real-world applications. Model-based meta reinforcement learning addresses these issues by learning dynamics and leveraging knowledge from prior experience. In this paper, we take a closer look at this framework and propose a new posterior sampling based approach that consists of a new model to identify task dynamics together with an amortized policy optimization step. We show that our model, called a graph structured surrogate model (GSSM), achieves competitive dynamics prediction performance with lower model complexity. Moreover, our approach in policy search is able to obtain high returns and allows fast execution by avoiding test-time policy gradient updates.

Cite this Paper

BibTeX


@InProceedings{pmlr-v162-wang22z,
  title     = {Model-based Meta Reinforcement Learning using Graph Structured Surrogate Models and Amortized Policy Search},
  author    = {Wang, Qi and Van Hoof, Herke},
  booktitle = {Proceedings of the 39th International Conference on Machine Learning},
  pages     = {23055--23077},
  year      = {2022},
  editor    = {Chaudhuri, Kamalika and Jegelka, Stefanie and Song, Le and Szepesvari, Csaba and Niu, Gang and Sabato, Sivan},
  volume    = {162},
  series    = {Proceedings of Machine Learning Research},
  month     = {17--23 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v162/wang22z/wang22z.pdf},
  url       = {https://proceedings.mlr.press/v162/wang22z.html},
  abstract  = {Reinforcement learning is a promising paradigm for solving sequential decision-making problems, but low data efficiency and weak generalization across tasks are bottlenecks in real-world applications. Model-based meta reinforcement learning addresses these issues by learning dynamics and leveraging knowledge from prior experience. In this paper, we take a closer look at this framework and propose a new posterior sampling based approach that consists of a new model to identify task dynamics together with an amortized policy optimization step. We show that our model, called a graph structured surrogate model (GSSM), achieves competitive dynamics prediction performance with lower model complexity. Moreover, our approach in policy search is able to obtain high returns and allows fast execution by avoiding test-time policy gradient updates.}
}

Endnote

%0 Conference Paper
%T Model-based Meta Reinforcement Learning using Graph Structured Surrogate Models and Amortized Policy Search
%A Qi Wang
%A Herke Van Hoof
%B Proceedings of the 39th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2022
%E Kamalika Chaudhuri
%E Stefanie Jegelka
%E Le Song
%E Csaba Szepesvari
%E Gang Niu
%E Sivan Sabato
%F pmlr-v162-wang22z
%I PMLR
%P 23055--23077
%U https://proceedings.mlr.press/v162/wang22z.html
%V 162
%X Reinforcement learning is a promising paradigm for solving sequential decision-making problems, but low data efficiency and weak generalization across tasks are bottlenecks in real-world applications. Model-based meta reinforcement learning addresses these issues by learning dynamics and leveraging knowledge from prior experience. In this paper, we take a closer look at this framework and propose a new posterior sampling based approach that consists of a new model to identify task dynamics together with an amortized policy optimization step. We show that our model, called a graph structured surrogate model (GSSM), achieves competitive dynamics prediction performance with lower model complexity. Moreover, our approach in policy search is able to obtain high returns and allows fast execution by avoiding test-time policy gradient updates.

APA


Wang, Q. & Van Hoof, H. (2022). Model-based Meta Reinforcement Learning using Graph Structured Surrogate Models and Amortized Policy Search. Proceedings of the 39th International Conference on Machine Learning, in Proceedings of Machine Learning Research 162:23055-23077. Available from https://proceedings.mlr.press/v162/wang22z.html.
