An Analysis of Categorical Distributional Reinforcement Learning

Mark Rowland; Marc Bellemare; Will Dabney; Remi Munos; Yee Whye Teh

Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics, PMLR 84:29-37, 2018.

Abstract

Distributional approaches to value-based reinforcement learning model the entire distribution of returns, rather than just their expected values, and have recently been shown to yield state-of-the-art empirical performance. This was demonstrated by the recently proposed C51 algorithm, based on categorical distributional reinforcement learning (CDRL) [Bellemare et al., 2017]. However, the theoretical properties of CDRL algorithms are not yet well understood. In this paper, we introduce a framework to analyse CDRL algorithms, establish the importance of the projected distributional Bellman operator in distributional RL, draw fundamental connections between CDRL and the Cramér distance, and give a proof of convergence for sample-based categorical distributional reinforcement learning algorithms.

Cite this Paper

BibTeX


@InProceedings{pmlr-v84-rowland18a,
  title = 	 {An Analysis of Categorical Distributional Reinforcement Learning},
  author = 	 {Rowland, Mark and Bellemare, Marc and Dabney, Will and Munos, Remi and Teh, Yee Whye},
  booktitle = 	 {Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics},
  pages = 	 {29--37},
  year = 	 {2018},
  editor = 	 {Storkey, Amos and Perez-Cruz, Fernando},
  volume = 	 {84},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {09--11 Apr},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v84/rowland18a/rowland18a.pdf},
  url = 	 {https://proceedings.mlr.press/v84/rowland18a.html},
  abstract = 	 {Distributional approaches to value-based reinforcement learning model the entire distribution of returns, rather than just their expected values, and have recently been shown to yield state-of-the-art empirical performance. This was demonstrated by the recently proposed C51 algorithm, based on categorical distributional reinforcement learning (CDRL) [Bellemare et al., 2017]. However, the theoretical properties of CDRL algorithms are not yet well understood. In this paper, we introduce a framework to analyse CDRL algorithms, establish the importance of the projected distributional Bellman operator in distributional RL, draw fundamental connections between CDRL and the Cramer distance, and give a proof of convergence for sample-based categorical distributional reinforcement learning algorithms.}
}

Endnote

%0 Conference Paper
%T An Analysis of Categorical Distributional Reinforcement Learning
%A Mark Rowland
%A Marc Bellemare
%A Will Dabney
%A Remi Munos
%A Yee Whye Teh
%B Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2018
%E Amos Storkey
%E Fernando Perez-Cruz
%F pmlr-v84-rowland18a
%I PMLR
%P 29--37
%U https://proceedings.mlr.press/v84/rowland18a.html
%V 84
%X Distributional approaches to value-based reinforcement learning model the entire distribution of returns, rather than just their expected values, and have recently been shown to yield state-of-the-art empirical performance. This was demonstrated by the recently proposed C51 algorithm, based on categorical distributional reinforcement learning (CDRL) [Bellemare et al., 2017]. However, the theoretical properties of CDRL algorithms are not yet well understood. In this paper, we introduce a framework to analyse CDRL algorithms, establish the importance of the projected distributional Bellman operator in distributional RL, draw fundamental connections between CDRL and the Cramer distance, and give a proof of convergence for sample-based categorical distributional reinforcement learning algorithms.

APA


Rowland, M., Bellemare, M., Dabney, W., Munos, R. & Teh, Y.W. (2018). An Analysis of Categorical Distributional Reinforcement Learning. Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 84:29-37. Available from https://proceedings.mlr.press/v84/rowland18a.html.
