Bigger, Better, Faster: Human-level Atari with human-level efficiency

Max Schwarzer, Johan Samir Obando Ceron, Aaron Courville, Marc G Bellemare, Rishabh Agarwal, Pablo Samuel Castro
Proceedings of the 40th International Conference on Machine Learning, PMLR 202:30365-30380, 2023.

Abstract

We introduce a value-based RL agent, which we call BBF, that achieves super-human performance in the Atari 100K benchmark. BBF relies on scaling the neural networks used for value estimation, as well as a number of other design choices that enable this scaling in a sample-efficient manner. We conduct extensive analyses of these design choices and provide insights for future work. We end with a discussion about updating the goalposts for sample-efficient RL research on the ALE. We make our code and data publicly available at https://github.com/google-research/google-research/tree/master/bigger_better_faster.

Cite this Paper


BibTeX
@InProceedings{pmlr-v202-schwarzer23a,
  title = {Bigger, Better, Faster: Human-level {A}tari with human-level efficiency},
  author = {Schwarzer, Max and Obando Ceron, Johan Samir and Courville, Aaron and Bellemare, Marc G and Agarwal, Rishabh and Castro, Pablo Samuel},
  booktitle = {Proceedings of the 40th International Conference on Machine Learning},
  pages = {30365--30380},
  year = {2023},
  editor = {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan},
  volume = {202},
  series = {Proceedings of Machine Learning Research},
  month = {23--29 Jul},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v202/schwarzer23a/schwarzer23a.pdf},
  url = {https://proceedings.mlr.press/v202/schwarzer23a.html},
  abstract = {We introduce a value-based RL agent, which we call BBF, that achieves super-human performance in the Atari 100K benchmark. BBF relies on scaling the neural networks used for value estimation, as well as a number of other design choices that enable this scaling in a sample-efficient manner. We conduct extensive analyses of these design choices and provide insights for future work. We end with a discussion about updating the goalposts for sample-efficient RL research on the ALE. We make our code and data publicly available at https://github.com/google-research/google-research/tree/master/bigger_better_faster.}
}
Endnote
%0 Conference Paper
%T Bigger, Better, Faster: Human-level Atari with human-level efficiency
%A Max Schwarzer
%A Johan Samir Obando Ceron
%A Aaron Courville
%A Marc G Bellemare
%A Rishabh Agarwal
%A Pablo Samuel Castro
%B Proceedings of the 40th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Andreas Krause
%E Emma Brunskill
%E Kyunghyun Cho
%E Barbara Engelhardt
%E Sivan Sabato
%E Jonathan Scarlett
%F pmlr-v202-schwarzer23a
%I PMLR
%P 30365--30380
%U https://proceedings.mlr.press/v202/schwarzer23a.html
%V 202
%X We introduce a value-based RL agent, which we call BBF, that achieves super-human performance in the Atari 100K benchmark. BBF relies on scaling the neural networks used for value estimation, as well as a number of other design choices that enable this scaling in a sample-efficient manner. We conduct extensive analyses of these design choices and provide insights for future work. We end with a discussion about updating the goalposts for sample-efficient RL research on the ALE. We make our code and data publicly available at https://github.com/google-research/google-research/tree/master/bigger_better_faster.
APA
Schwarzer, M., Obando Ceron, J.S., Courville, A., Bellemare, M.G., Agarwal, R. & Castro, P.S. (2023). Bigger, Better, Faster: Human-level Atari with human-level efficiency. Proceedings of the 40th International Conference on Machine Learning, in Proceedings of Machine Learning Research 202:30365-30380. Available from https://proceedings.mlr.press/v202/schwarzer23a.html.
