A Game Theoretic Framework for Model Based Reinforcement Learning

Aravind Rajeswaran, Igor Mordatch, Vikash Kumar
Proceedings of the 37th International Conference on Machine Learning, PMLR 119:7953-7963, 2020.

Abstract

Designing stable and efficient algorithms for model-based reinforcement learning (MBRL) with function approximation has remained challenging despite growing interest in the field. To help expose the practical challenges in MBRL and simplify algorithm design through the lens of abstraction, we develop a new framework that casts MBRL as a game between: (1) a policy player, which attempts to maximize rewards under the learned model; (2) a model player, which attempts to fit the real-world data collected by the policy player. We show that a near-optimal policy for the environment can be obtained by finding an approximate equilibrium of this game, and we develop two families of algorithms to find the equilibrium by drawing upon ideas from Stackelberg games. Experimental studies suggest that the proposed algorithms achieve state-of-the-art sample efficiency, match the asymptotic performance of model-free policy gradient methods, and scale gracefully to high-dimensional tasks like dexterous hand manipulation. Project page: \url{https://sites.google.com/view/mbrl-game}.
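To make the two-player structure described in the abstract concrete, below is a minimal, hypothetical Python sketch, not the paper's actual algorithms: a policy player improves a simple feedback gain against the current learned model, and a model player refits a linear model to fresh real-world data collected by that policy, in a toy 1-D environment. The environment, parameter names, and the naive alternating updates are illustrative assumptions only.

```python
# Hypothetical toy sketch of the MBRL game described in the abstract: a policy
# player maximizes reward under a learned model, while a model player fits the
# real-world data collected by the policy. Everything here is illustrative.
import numpy as np

rng = np.random.default_rng(0)

# Unknown true dynamics of a 1-D toy environment: s' = a_true*s + b_true*u + noise
a_true, b_true = 0.9, 0.5
def env_step(s, u):
    return a_true * s + b_true * u + 0.01 * rng.standard_normal()

def reward(s, u):
    return -(s ** 2) - 0.1 * (u ** 2)   # drive the state to zero with small controls

theta = np.array([0.0, 0.0])   # model player's parameters (a_hat, b_hat)
k = 0.0                        # policy player's feedback gain, u = -k * s

def rollout(k, step_fn, s0=1.0, horizon=20):
    """Total reward of the policy u = -k*s under a given dynamics function."""
    s, total = s0, 0.0
    for _ in range(horizon):
        u = -k * s
        total += reward(s, u)
        s = step_fn(s, u)
    return total

for it in range(50):
    # --- Policy player: improve k against the current learned model ---
    model_step = lambda s, u: theta[0] * s + theta[1] * u
    candidates = k + 0.1 * rng.standard_normal(32)
    k = max(candidates, key=lambda kc: rollout(kc, model_step))

    # --- Model player: refit (a_hat, b_hat) to fresh data collected by the policy ---
    X, y = [], []
    s = 1.0
    for _ in range(20):
        u = -k * s + 0.05 * rng.standard_normal()   # small exploration noise
        s_next = env_step(s, u)
        X.append([s, u]); y.append(s_next)
        s = s_next
    theta, *_ = np.linalg.lstsq(np.array(X), np.array(y), rcond=None)

print("learned model:", theta, " policy gain:", k,
      " true-env return:", rollout(k, env_step))
```

The alternating loop only mirrors the game structure stated in the abstract; the Stackelberg-based algorithm families proposed in the paper are more deliberate about which player leads and how each objective is solved.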

Cite this Paper


BibTeX
@InProceedings{pmlr-v119-rajeswaran20a,
  title     = {A Game Theoretic Framework for Model Based Reinforcement Learning},
  author    = {Rajeswaran, Aravind and Mordatch, Igor and Kumar, Vikash},
  booktitle = {Proceedings of the 37th International Conference on Machine Learning},
  pages     = {7953--7963},
  year      = {2020},
  editor    = {III, Hal Daumé and Singh, Aarti},
  volume    = {119},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--18 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v119/rajeswaran20a/rajeswaran20a.pdf},
  url       = {https://proceedings.mlr.press/v119/rajeswaran20a.html},
  abstract  = {Designing stable and efficient algorithms for model-based reinforcement learning (MBRL) with function approximation has remained challenging despite growing interest in the field. To help expose the practical challenges in MBRL and simplify algorithm design from the lens of abstraction, we develop a new framework that casts MBRL as a game between: (1) a policy player, which attempts to maximize rewards under the learned model; (2) a model player, which attempts to fit the real-world data collected by the policy player. We show that a near-optimal policy for the environment can be obtained by finding an approximate equilibrium for aforementioned game, and we develop two families of algorithms to find the game equilibrium by drawing upon ideas from Stackelberg games. Experimental studies suggest that the proposed algorithms achieve state of the art sample efficiency, match the asymptotic performance of model-free policy gradient, and scale gracefully to high-dimensional tasks like dexterous hand manipulation. Project page: \url{https://sites.google.com/view/mbrl-game}.}
}
Endnote
%0 Conference Paper
%T A Game Theoretic Framework for Model Based Reinforcement Learning
%A Aravind Rajeswaran
%A Igor Mordatch
%A Vikash Kumar
%B Proceedings of the 37th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2020
%E Hal Daumé III
%E Aarti Singh
%F pmlr-v119-rajeswaran20a
%I PMLR
%P 7953--7963
%U https://proceedings.mlr.press/v119/rajeswaran20a.html
%V 119
%X Designing stable and efficient algorithms for model-based reinforcement learning (MBRL) with function approximation has remained challenging despite growing interest in the field. To help expose the practical challenges in MBRL and simplify algorithm design from the lens of abstraction, we develop a new framework that casts MBRL as a game between: (1) a policy player, which attempts to maximize rewards under the learned model; (2) a model player, which attempts to fit the real-world data collected by the policy player. We show that a near-optimal policy for the environment can be obtained by finding an approximate equilibrium for aforementioned game, and we develop two families of algorithms to find the game equilibrium by drawing upon ideas from Stackelberg games. Experimental studies suggest that the proposed algorithms achieve state of the art sample efficiency, match the asymptotic performance of model-free policy gradient, and scale gracefully to high-dimensional tasks like dexterous hand manipulation. Project page: \url{https://sites.google.com/view/mbrl-game}.
APA
Rajeswaran, A., Mordatch, I. & Kumar, V. (2020). A Game Theoretic Framework for Model Based Reinforcement Learning. Proceedings of the 37th International Conference on Machine Learning, in Proceedings of Machine Learning Research 119:7953-7963. Available from https://proceedings.mlr.press/v119/rajeswaran20a.html.
