Beyond The Rainbow: High Performance Deep Reinforcement Learning on a Desktop PC

Tyler Clark, Mark Towers, Christine Evers, Jonathon Hare
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:11064-11091, 2025.

Abstract

Rainbow Deep Q-Network (DQN) demonstrated that combining multiple independent enhancements could significantly boost a reinforcement learning (RL) agent’s performance. In this paper, we present “Beyond The Rainbow” (BTR), a novel algorithm that integrates six improvements from across the RL literature into Rainbow DQN, establishing a new state of the art for RL using a desktop PC, with a human-normalized interquartile mean (IQM) of 7.6 on Atari-60. Beyond Atari, we demonstrate BTR’s capability to handle complex 3D games, successfully training agents to play Super Mario Galaxy, Mario Kart, and Mortal Kombat with minimal algorithmic changes. Because BTR is designed with computational efficiency in mind, agents can be trained on 200 million Atari frames within 12 hours using a high-end desktop PC. Additionally, we conduct detailed ablation studies of each component, analyzing its performance and impact using numerous measures.
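For readers unfamiliar with the headline metric, the following is a minimal sketch of how a human-normalized interquartile mean is commonly computed. It is not taken from the paper: the function names, the game labels, and every score below are hypothetical, and the paper's exact aggregation (for example, how seeds and games are pooled) is not stated on this page.

```python
from statistics import fmean


def human_normalized(agent, random, human):
    """Scale a raw score so that 0 matches random play and 1 matches the human reference."""
    return (agent - random) / (human - random)


def interquartile_mean(scores):
    """Mean of the middle 50% of scores: drop the lowest and highest 25%, then average."""
    ordered = sorted(scores)
    cut = int(0.25 * len(ordered))  # number of scores trimmed from each end
    return fmean(ordered[cut:len(ordered) - cut])


# Hypothetical (agent, random, human) raw scores; NOT results from the paper.
raw_scores = {
    "game_a": (120.0, 20.0, 70.0),
    "game_b": (10.0, 0.0, 40.0),
    "game_c": (900.0, 100.0, 300.0),
    "game_d": (55.0, 5.0, 105.0),
}

normalized = [human_normalized(*triple) for triple in raw_scores.values()]
iqm = interquartile_mean(normalized)
print(f"human-normalized IQM: {iqm:.2f}")
```

Under this convention a value above 1 means the middle half of the normalized scores averages better than the human reference, and trimming the extremes keeps a single very high or very low game from dominating the aggregate.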

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-clark25a,
  title = {Beyond The Rainbow: High Performance Deep Reinforcement Learning on a Desktop {PC}},
  author = {Clark, Tyler and Towers, Mark and Evers, Christine and Hare, Jonathon},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages = {11064--11091},
  year = {2025},
  editor = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume = {267},
  series = {Proceedings of Machine Learning Research},
  month = {13--19 Jul},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/clark25a/clark25a.pdf},
  url = {https://proceedings.mlr.press/v267/clark25a.html},
  abstract = {Rainbow Deep Q-Network (DQN) demonstrated combining multiple independent enhancements could significantly boost a reinforcement learning (RL) agent’s performance. In this paper, we present “Beyond The Rainbow” (BTR), a novel algorithm that integrates six improvements from across the RL literature to Rainbow DQN, establishing a new state-of-the-art for RL using a desktop PC, with a human-normalized interquartile mean (IQM) of 7.6 on Atari-60. Beyond Atari, we demonstrate BTR’s capability to handle complex 3D games, successfully training agents to play Super Mario Galaxy, Mario Kart, and Mortal Kombat with minimal algorithmic changes. Designing BTR with computational efficiency in mind, agents can be trained using a high-end desktop PC on 200 million Atari frames within 12 hours. Additionally, we conduct detailed ablation studies of each component, analyzing the performance and impact using numerous measures.}
}
Endnote
%0 Conference Paper
%T Beyond The Rainbow: High Performance Deep Reinforcement Learning on a Desktop PC
%A Tyler Clark
%A Mark Towers
%A Christine Evers
%A Jonathon Hare
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-clark25a
%I PMLR
%P 11064--11091
%U https://proceedings.mlr.press/v267/clark25a.html
%V 267
%X Rainbow Deep Q-Network (DQN) demonstrated combining multiple independent enhancements could significantly boost a reinforcement learning (RL) agent’s performance. In this paper, we present “Beyond The Rainbow” (BTR), a novel algorithm that integrates six improvements from across the RL literature to Rainbow DQN, establishing a new state-of-the-art for RL using a desktop PC, with a human-normalized interquartile mean (IQM) of 7.6 on Atari-60. Beyond Atari, we demonstrate BTR’s capability to handle complex 3D games, successfully training agents to play Super Mario Galaxy, Mario Kart, and Mortal Kombat with minimal algorithmic changes. Designing BTR with computational efficiency in mind, agents can be trained using a high-end desktop PC on 200 million Atari frames within 12 hours. Additionally, we conduct detailed ablation studies of each component, analyzing the performance and impact using numerous measures.
APA
Clark, T., Towers, M., Evers, C. & Hare, J. (2025). Beyond The Rainbow: High Performance Deep Reinforcement Learning on a Desktop PC. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:11064-11091. Available from https://proceedings.mlr.press/v267/clark25a.html.
