Leveraging Procedural Generation to Benchmark Reinforcement Learning

Karl Cobbe, Chris Hesse, Jacob Hilton, John Schulman
Proceedings of the 37th International Conference on Machine Learning, PMLR 119:2048-2056, 2020.

Abstract

We introduce Procgen Benchmark, a suite of 16 procedurally generated game-like environments designed to benchmark both sample efficiency and generalization in reinforcement learning. We believe that the community will benefit from increased access to high quality training environments, and we provide detailed experimental protocols for using this benchmark. We empirically demonstrate that diverse environment distributions are essential to adequately train and evaluate RL agents, thereby motivating the extensive use of procedural content generation. We then use this benchmark to investigate the effects of scaling model size, finding that larger models significantly improve both sample efficiency and generalization.
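
For readers who want to try the benchmark, the environments ship as the open-source procgen Python package (pip install procgen) and register with Gym. The sketch below is a minimal illustration of the train/test split behind the generalization protocol described above: train on a finite set of procedurally generated levels, then evaluate on the unrestricted level distribution. It assumes the pre-0.26 Gym step API, and the level counts here are illustrative rather than the paper's exact settings.

# A minimal sketch of the train/test protocol, assuming the `procgen`
# package and the pre-0.26 Gym step API; level counts are illustrative.
import gym

# Training: restrict the agent to a finite set of procedurally generated levels.
train_env = gym.make(
    "procgen:procgen-coinrun-v0",
    num_levels=200,            # number of unique levels available during training
    start_level=0,             # seed offset selecting which levels are generated
    distribution_mode="easy",  # "easy" or "hard" difficulty distribution
)

# Evaluation: measure generalization on the full, unrestricted level distribution.
test_env = gym.make(
    "procgen:procgen-coinrun-v0",
    num_levels=0,              # 0 means unlimited levels
    start_level=0,
    distribution_mode="easy",
)

obs = train_env.reset()        # 64x64x3 uint8 image observation
obs, reward, done, info = train_env.step(train_env.action_space.sample())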

Cite this Paper


BibTeX
@InProceedings{pmlr-v119-cobbe20a,
  title     = {Leveraging Procedural Generation to Benchmark Reinforcement Learning},
  author    = {Cobbe, Karl and Hesse, Chris and Hilton, Jacob and Schulman, John},
  booktitle = {Proceedings of the 37th International Conference on Machine Learning},
  pages     = {2048--2056},
  year      = {2020},
  editor    = {III, Hal Daumé and Singh, Aarti},
  volume    = {119},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--18 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v119/cobbe20a/cobbe20a.pdf},
  url       = {https://proceedings.mlr.press/v119/cobbe20a.html},
  abstract  = {We introduce Procgen Benchmark, a suite of 16 procedurally generated game-like environments designed to benchmark both sample efficiency and generalization in reinforcement learning. We believe that the community will benefit from increased access to high quality training environments, and we provide detailed experimental protocols for using this benchmark. We empirically demonstrate that diverse environment distributions are essential to adequately train and evaluate RL agents, thereby motivating the extensive use of procedural content generation. We then use this benchmark to investigate the effects of scaling model size, finding that larger models significantly improve both sample efficiency and generalization.}
}
Endnote
%0 Conference Paper
%T Leveraging Procedural Generation to Benchmark Reinforcement Learning
%A Karl Cobbe
%A Chris Hesse
%A Jacob Hilton
%A John Schulman
%B Proceedings of the 37th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2020
%E Hal Daumé III
%E Aarti Singh
%F pmlr-v119-cobbe20a
%I PMLR
%P 2048--2056
%U https://proceedings.mlr.press/v119/cobbe20a.html
%V 119
%X We introduce Procgen Benchmark, a suite of 16 procedurally generated game-like environments designed to benchmark both sample efficiency and generalization in reinforcement learning. We believe that the community will benefit from increased access to high quality training environments, and we provide detailed experimental protocols for using this benchmark. We empirically demonstrate that diverse environment distributions are essential to adequately train and evaluate RL agents, thereby motivating the extensive use of procedural content generation. We then use this benchmark to investigate the effects of scaling model size, finding that larger models significantly improve both sample efficiency and generalization.
APA
Cobbe, K., Hesse, C., Hilton, J. & Schulman, J. (2020). Leveraging Procedural Generation to Benchmark Reinforcement Learning. Proceedings of the 37th International Conference on Machine Learning, in Proceedings of Machine Learning Research 119:2048-2056. Available from https://proceedings.mlr.press/v119/cobbe20a.html.
