Sample Efficient Reinforcement Learning through Learning from Demonstrations in Minecraft

Christian Scheller; Yanick Schraner; Manfred Vogel

Proceedings of the NeurIPS 2019 Competition and Demonstration Track, PMLR 123:67-76, 2020.

Abstract

Sample inefficiency of deep reinforcement learning methods is a major obstacle for their use in real-world applications. In this work, we show how human demonstrations can improve final performance of agents on the Minecraft minigame ObtainDiamond with only 8M frames of environment interaction. We propose a training procedure where policy networks are first trained on human data and later fine-tuned by reinforcement learning. Using a policy exploitation mechanism, experience replay and an additional loss against catastrophic forgetting, our best agent was able to achieve a mean score of 48. Our proposed solution placed 3rd in the NeurIPS MineRL Competition for Sample-Efficient Reinforcement Learning.

Cite this Paper

BibTeX


@InProceedings{pmlr-v123-scheller20a,
  title = 	 {Sample Efficient Reinforcement Learning through Learning from Demonstrations in Minecraft},
  author =       {Scheller, Christian and Schraner, Yanick and Vogel, Manfred},
  booktitle = 	 {Proceedings of the NeurIPS 2019 Competition and Demonstration Track},
  pages = 	 {67--76},
  year = 	 {2020},
  editor = 	 {Escalante, Hugo Jair and Hadsell, Raia},
  volume = 	 {123},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {08--14 Dec},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v123/scheller20a/scheller20a.pdf},
  url = 	 {https://proceedings.mlr.press/v123/scheller20a.html},
  abstract = 	 {  Sample inefficiency of deep reinforcement learning methods is a major obstacle for their use in real-world applications.  In this work, we show how human demonstrations can improve final performance of agents on the Minecraft minigame ObtainDiamond with only 8M frames of environment interaction.  We propose a training procedure where policy networks are first trained on human data and later fine-tuned by reinforcement learning.  Using a policy exploitation mechanism, experience replay and an additional loss against catastrophic forgetting, our best agent was able to achieve a mean score of 48.  Our proposed solution placed 3rd in the NeurIPS MineRL Competition for Sample-Efficient Reinforcement Learning.}
}

Endnote

%0 Conference Paper
%T Sample Efficient Reinforcement Learning through Learning from Demonstrations in Minecraft
%A Christian Scheller
%A Yanick Schraner
%A Manfred Vogel
%B Proceedings of the NeurIPS 2019 Competition and Demonstration Track
%C Proceedings of Machine Learning Research
%D 2020
%E Hugo Jair Escalante
%E Raia Hadsell
%F pmlr-v123-scheller20a
%I PMLR
%P 67--76
%U https://proceedings.mlr.press/v123/scheller20a.html
%V 123
%X   Sample inefficiency of deep reinforcement learning methods is a major obstacle for their use in real-world applications.  In this work, we show how human demonstrations can improve final performance of agents on the Minecraft minigame ObtainDiamond with only 8M frames of environment interaction.  We propose a training procedure where policy networks are first trained on human data and later fine-tuned by reinforcement learning.  Using a policy exploitation mechanism, experience replay and an additional loss against catastrophic forgetting, our best agent was able to achieve a mean score of 48.  Our proposed solution placed 3rd in the NeurIPS MineRL Competition for Sample-Efficient Reinforcement Learning.

APA


Scheller, C., Schraner, Y., & Vogel, M. (2020). Sample Efficient Reinforcement Learning through Learning from Demonstrations in Minecraft. Proceedings of the NeurIPS 2019 Competition and Demonstration Track, in Proceedings of Machine Learning Research 123:67-76. Available from https://proceedings.mlr.press/v123/scheller20a.html.
