Towards robust and domain agnostic reinforcement learning competitions: MineRL 2020

William Hebgen Guss; Stephanie Milani; Nicholay Topin; Brandon Houghton; Sharada Mohanty; Andrew Melnik; Augustin Harter; Benoit Buschmaas; Bjarne Jaster; Christoph Berganski; Dennis Heitkamp; Marko Henning; Helge Ritter; Chengjie Wu; Xiaotian Hao; Yiming Lu; Hangyu Mao; Yihuan Mao; Chao Wang; Michal Opanowicz; Anssi Kanervisto; Yanick Schraner; Christian Scheller; Xiren Zhou; Lu Liu; Daichi Nishio; Toi Tsuneda; Karolis Ramanauskas; Gabija Juceviciute

Proceedings of the NeurIPS 2020 Competition and Demonstration Track, PMLR 133:233-252, 2021.

Abstract

Reinforcement learning competitions have formed the basis for standard research benchmarks, galvanized advances in the state-of-the-art, and shaped the direction of the field. Despite this, a majority of challenges suffer from the same fundamental problems: participant solutions to the posed challenge are usually domain-specific, biased to maximally exploit compute resources, and not guaranteed to be reproducible. In this paper, we present a new framework of competition design that promotes the development of algorithms that overcome these barriers. We propose four central mechanisms for achieving this end: submission retraining, domain randomization, desemantization through domain obfuscation, and the limitation of competition compute and environment-sample budget. To demonstrate the efficacy of this design, we proposed, organized, and ran the MineRL 2020 Competition on Sample-Efficient Reinforcement Learning. In this work, we describe the organizational outcomes of the competition and show that the resulting participant submissions are reproducible, non-specific to the competition environment, and sample/resource efficient, despite the difficult competition task.

Cite this Paper

BibTeX


@InProceedings{pmlr-v133-guss21a,
  title = {Towards robust and domain agnostic reinforcement learning competitions: MineRL 2020},
  author = {Guss, William Hebgen and Milani, Stephanie and Topin, Nicholay and Houghton, Brandon and Mohanty, Sharada and Melnik, Andrew and Harter, Augustin and Buschmaas, Benoit and Jaster, Bjarne and Berganski, Christoph and Heitkamp, Dennis and Henning, Marko and Ritter, Helge and Wu, Chengjie and Hao, Xiaotian and Lu, Yiming and Mao, Hangyu and Mao, Yihuan and Wang, Chao and Opanowicz, Michal and Kanervisto, Anssi and Schraner, Yanick and Scheller, Christian and Zhou, Xiren and Liu, Lu and Nishio, Daichi and Tsuneda, Toi and Ramanauskas, Karolis and Juceviciute, Gabija},
  booktitle = {Proceedings of the NeurIPS 2020 Competition and Demonstration Track},
  pages = {233--252},
  year = {2021},
  editor = {Escalante, Hugo Jair and Hofmann, Katja},
  volume = {133},
  series = {Proceedings of Machine Learning Research},
  month = {06--12 Dec},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v133/guss21a/guss21a.pdf},
  url = {https://proceedings.mlr.press/v133/guss21a.html},
  abstract = 	 {Reinforcement learning competitions have formed the basis for standard research benchmarks, galvanized advances in the state-of-the-art, and shaped the direction of the field. Despite this, a majority of challenges suffer from the same fundamental problems: participant solutions to the posed challenge are usually domain-specific, biased to maximally exploit compute resources, and not guaranteed to be reproducible. In this paper, we present a new framework of competition design that promotes the development of algorithms that overcome these barriers. We propose four central mechanisms for achieving this end: submission retraining, domain randomization, desemantization through domain obfuscation, and the limitation of competition compute and environment-sample budget. To demonstrate the efficacy of this design, we proposed, organized, and ran the MineRL 2020 Competition on Sample-Efficient Reinforcement Learning. In this work, we describe the organizational outcomes of the competition and show that the resulting participant submissions are reproducible, non-specific to the competition environment, and sample/resource efficient, despite the difficult competition task.}
}

Endnote

%0 Conference Paper
%T Towards robust and domain agnostic reinforcement learning competitions: MineRL 2020
%A William Hebgen Guss
%A Stephanie Milani
%A Nicholay Topin
%A Brandon Houghton
%A Sharada Mohanty
%A Andrew Melnik
%A Augustin Harter
%A Benoit Buschmaas
%A Bjarne Jaster
%A Christoph Berganski
%A Dennis Heitkamp
%A Marko Henning
%A Helge Ritter
%A Chengjie Wu
%A Xiaotian Hao
%A Yiming Lu
%A Hangyu Mao
%A Yihuan Mao
%A Chao Wang
%A Michal Opanowicz
%A Anssi Kanervisto
%A Yanick Schraner
%A Christian Scheller
%A Xiren Zhou
%A Lu Liu
%A Daichi Nishio
%A Toi Tsuneda
%A Karolis Ramanauskas
%A Gabija Juceviciute
%B Proceedings of the NeurIPS 2020 Competition and Demonstration Track
%C Proceedings of Machine Learning Research
%D 2021
%E Hugo Jair Escalante
%E Katja Hofmann
%F pmlr-v133-guss21a
%I PMLR
%P 233--252
%U https://proceedings.mlr.press/v133/guss21a.html
%V 133
%X Reinforcement learning competitions have formed the basis for standard research benchmarks, galvanized advances in the state-of-the-art, and shaped the direction of the field. Despite this, a majority of challenges suffer from the same fundamental problems: participant solutions to the posed challenge are usually domain-specific, biased to maximally exploit compute resources, and not guaranteed to be reproducible. In this paper, we present a new framework of competition design that promotes the development of algorithms that overcome these barriers. We propose four central mechanisms for achieving this end: submission retraining, domain randomization, desemantization through domain obfuscation, and the limitation of competition compute and environment-sample budget. To demonstrate the efficacy of this design, we proposed, organized, and ran the MineRL 2020 Competition on Sample-Efficient Reinforcement Learning. In this work, we describe the organizational outcomes of the competition and show that the resulting participant submissions are reproducible, non-specific to the competition environment, and sample/resource efficient, despite the difficult competition task.

APA


Guss, W.H., Milani, S., Topin, N., Houghton, B., Mohanty, S., Melnik, A., Harter, A., Buschmaas, B., Jaster, B., Berganski, C., Heitkamp, D., Henning, M., Ritter, H., Wu, C., Hao, X., Lu, Y., Mao, H., Mao, Y., Wang, C., Opanowicz, M., Kanervisto, A., Schraner, Y., Scheller, C., Zhou, X., Liu, L., Nishio, D., Tsuneda, T., Ramanauskas, K. & Juceviciute, G. (2021). Towards robust and domain agnostic reinforcement learning competitions: MineRL 2020. Proceedings of the NeurIPS 2020 Competition and Demonstration Track, in Proceedings of Machine Learning Research 133:233-252. Available from https://proceedings.mlr.press/v133/guss21a.html.
