Implicit Quantile Networks for Distributional Reinforcement Learning

Will Dabney; Georg Ostrovski; David Silver; Remi Munos

Proceedings of the 35th International Conference on Machine Learning, PMLR 80:1096-1105, 2018.

Abstract

In this work, we build on recent advances in distributional reinforcement learning to give a generally applicable, flexible, and state-of-the-art distributional variant of DQN. We achieve this by using quantile regression to approximate the full quantile function for the state-action return distribution. By reparameterizing a distribution over the sample space, this yields an implicitly defined return distribution and gives rise to a large class of risk-sensitive policies. We demonstrate improved performance on the 57 Atari 2600 games in the ALE, and use our algorithm’s implicitly defined distributions to study the effects of risk-sensitive policies in Atari games.
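The quantile regression idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's network or training procedure. It only assumes NumPy and shows that minimizing the standard quantile-regression (pinball) loss over a scalar estimate recovers the corresponding quantile of a sample. The function names `pinball_loss` and `fit_quantile`, the learning rate, and the step count are hypothetical choices made for illustration.

```python
import numpy as np


def pinball_loss(theta, samples, tau):
    """Mean quantile-regression (pinball) loss of the estimate theta at level tau."""
    u = samples - theta  # positive where the estimate lies below the sample
    return float(np.mean(np.where(u >= 0, tau * u, (tau - 1.0) * u)))


def fit_quantile(samples, tau, lr=0.05, steps=5000):
    """Estimate the tau-quantile by subgradient descent on the pinball loss."""
    theta = float(np.mean(samples))  # arbitrary starting point
    for _ in range(steps):
        # Subgradient of the mean pinball loss w.r.t. theta:
        # (fraction of samples below theta) - tau
        grad = float(np.mean(samples < theta)) - tau
        theta -= lr * grad
    return theta
```

Calling `fit_quantile` for several values of `tau` in (0, 1) gives a pointwise approximation of the quantile function of the sample. Restricting or reweighting the `tau` values that are averaged over is one simple way to picture how a risk-sensitive summary of a return distribution could be formed.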

Cite this Paper

BibTeX


@InProceedings{pmlr-v80-dabney18a,
  title = 	 {Implicit Quantile Networks for Distributional Reinforcement Learning},
  author =       {Dabney, Will and Ostrovski, Georg and Silver, David and Munos, Remi},
  booktitle = 	 {Proceedings of the 35th International Conference on Machine Learning},
  pages = 	 {1096--1105},
  year = 	 {2018},
  editor = 	 {Dy, Jennifer and Krause, Andreas},
  volume = 	 {80},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {10--15 Jul},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v80/dabney18a/dabney18a.pdf},
  url = 	 {https://proceedings.mlr.press/v80/dabney18a.html},
  abstract = 	 {In this work, we build on recent advances in distributional reinforcement learning to give a generally applicable, flexible, and state-of-the-art distributional variant of DQN. We achieve this by using quantile regression to approximate the full quantile function for the state-action return distribution. By reparameterizing a distribution over the sample space, this yields an implicitly defined return distribution and gives rise to a large class of risk-sensitive policies. We demonstrate improved performance on the 57 Atari 2600 games in the ALE, and use our algorithm’s implicitly defined distributions to study the effects of risk-sensitive policies in Atari games.}
}

Endnote

%0 Conference Paper
%T Implicit Quantile Networks for Distributional Reinforcement Learning
%A Will Dabney
%A Georg Ostrovski
%A David Silver
%A Remi Munos
%B Proceedings of the 35th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2018
%E Jennifer Dy
%E Andreas Krause
%F pmlr-v80-dabney18a
%I PMLR
%P 1096--1105
%U https://proceedings.mlr.press/v80/dabney18a.html
%V 80
%X In this work, we build on recent advances in distributional reinforcement learning to give a generally applicable, flexible, and state-of-the-art distributional variant of DQN. We achieve this by using quantile regression to approximate the full quantile function for the state-action return distribution. By reparameterizing a distribution over the sample space, this yields an implicitly defined return distribution and gives rise to a large class of risk-sensitive policies. We demonstrate improved performance on the 57 Atari 2600 games in the ALE, and use our algorithm’s implicitly defined distributions to study the effects of risk-sensitive policies in Atari games.

APA


Dabney, W., Ostrovski, G., Silver, D. & Munos, R. (2018). Implicit Quantile Networks for Distributional Reinforcement Learning. Proceedings of the 35th International Conference on Machine Learning, in Proceedings of Machine Learning Research 80:1096-1105. Available from https://proceedings.mlr.press/v80/dabney18a.html.
