Sample Factory: Egocentric 3D Control from Pixels at 100000 FPS with Asynchronous Reinforcement Learning

Aleksei Petrenko, Zhehui Huang, Tushar Kumar, Gaurav Sukhatme, Vladlen Koltun
Proceedings of the 37th International Conference on Machine Learning, PMLR 119:7652-7662, 2020.

Abstract

Increasing the scale of reinforcement learning experiments has allowed researchers to achieve unprecedented results in both training sophisticated agents for video games, and in sim-to-real transfer for robotics. Typically such experiments rely on large distributed systems and require expensive hardware setups, limiting wider access to this exciting area of research. In this work we aim to solve this problem by optimizing the efficiency and resource utilization of reinforcement learning algorithms instead of relying on distributed computation. We present the "Sample Factory", a high-throughput training system optimized for a single-machine setting. Our architecture combines a highly efficient, asynchronous, GPU-based sampler with off-policy correction techniques, allowing us to achieve throughput higher than $10^5$ environment frames/second on non-trivial control problems in 3D without sacrificing sample efficiency. We extend Sample Factory to support self-play and population-based training and apply these techniques to train highly capable agents for a multiplayer first-person shooter game. Github: https://github.com/alex-petrenko/sample-factory

Cite this Paper


BibTeX
@InProceedings{pmlr-v119-petrenko20a, title = {Sample Factory: Egocentric 3{D} Control from Pixels at 100000 {FPS} with Asynchronous Reinforcement Learning}, author = {Petrenko, Aleksei and Huang, Zhehui and Kumar, Tushar and Sukhatme, Gaurav and Koltun, Vladlen}, booktitle = {Proceedings of the 37th International Conference on Machine Learning}, pages = {7652--7662}, year = {2020}, editor = {III, Hal Daumé and Singh, Aarti}, volume = {119}, series = {Proceedings of Machine Learning Research}, month = {13--18 Jul}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v119/petrenko20a/petrenko20a.pdf}, url = {https://proceedings.mlr.press/v119/petrenko20a.html}, abstract = {Increasing the scale of reinforcement learning experiments has allowed researchers to achieve unprecedented results in both training sophisticated agents for video games, and in sim-to-real transfer for robotics. Typically such experiments rely on large distributed systems and require expensive hardware setups, limiting wider access to this exciting area of research. In this work we aim to solve this problem by optimizing the efficiency and resource utilization of reinforcement learning algorithms instead of relying on distributed computation. We present the "Sample Factory", a high-throughput training system optimized for a single-machine setting. Our architecture combines a highly efficient, asynchronous, GPU-based sampler with off-policy correction techniques, allowing us to achieve throughput higher than $10^5$ environment frames/second on non-trivial control problems in 3D without sacrificing sample efficiency. We extend Sample Factory to support self-play and population-based training and apply these techniques to train highly capable agents for a multiplayer first-person shooter game. Github: https://github.com/alex-petrenko/sample-factory} }
Endnote
%0 Conference Paper %T Sample Factory: Egocentric 3D Control from Pixels at 100000 FPS with Asynchronous Reinforcement Learning %A Aleksei Petrenko %A Zhehui Huang %A Tushar Kumar %A Gaurav Sukhatme %A Vladlen Koltun %B Proceedings of the 37th International Conference on Machine Learning %C Proceedings of Machine Learning Research %D 2020 %E Hal Daumé III %E Aarti Singh %F pmlr-v119-petrenko20a %I PMLR %P 7652--7662 %U https://proceedings.mlr.press/v119/petrenko20a.html %V 119 %X Increasing the scale of reinforcement learning experiments has allowed researchers to achieve unprecedented results in both training sophisticated agents for video games, and in sim-to-real transfer for robotics. Typically such experiments rely on large distributed systems and require expensive hardware setups, limiting wider access to this exciting area of research. In this work we aim to solve this problem by optimizing the efficiency and resource utilization of reinforcement learning algorithms instead of relying on distributed computation. We present the "Sample Factory", a high-throughput training system optimized for a single-machine setting. Our architecture combines a highly efficient, asynchronous, GPU-based sampler with off-policy correction techniques, allowing us to achieve throughput higher than $10^5$ environment frames/second on non-trivial control problems in 3D without sacrificing sample efficiency. We extend Sample Factory to support self-play and population-based training and apply these techniques to train highly capable agents for a multiplayer first-person shooter game. Github: https://github.com/alex-petrenko/sample-factory
APA
Petrenko, A., Huang, Z., Kumar, T., Sukhatme, G. & Koltun, V.. (2020). Sample Factory: Egocentric 3D Control from Pixels at 100000 FPS with Asynchronous Reinforcement Learning. Proceedings of the 37th International Conference on Machine Learning, in Proceedings of Machine Learning Research 119:7652-7662 Available from https://proceedings.mlr.press/v119/petrenko20a.html.

Related Material