Distributional Bellman Operators over Mean Embeddings

Li Kevin Wenliang, Gregoire Deletang, Matthew Aitchison, Marcus Hutter, Anian Ruoss, Arthur Gretton, Mark Rowland
Proceedings of the 41st International Conference on Machine Learning, PMLR 235:52839-52868, 2024.

Abstract

We propose a novel algorithmic framework for distributional reinforcement learning, based on learning finite-dimensional mean embeddings of return distributions. The framework reveals a wide variety of new algorithms for dynamic programming and temporal-difference algorithms that rely on the sketch Bellman operator, which updates mean embeddings with simple linear-algebraic computations. We provide asymptotic convergence theory, and examine the empirical performance of the algorithms on a suite of tabular tasks. Further, we show that this approach can be straightforwardly combined with deep reinforcement learning.
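
As a rough illustration of the "simple linear-algebraic computations" mentioned in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes polynomial features φ(g) = (1, g, g²), chosen here only because they give an exact matrix B_r with φ(r + γg) = B_r φ(g). The paper's own feature maps and the way it obtains such matrices may differ. The names `bellman_coefficients` and `sketch_dp`, and the two-state example, are hypothetical.

```python
import numpy as np


def bellman_coefficients(r, gamma):
    """Matrix B_r with phi(r + gamma * g) = B_r @ phi(g) for phi(g) = (1, g, g^2).

    Assumption for illustration: polynomial features make this relation exact.
    """
    return np.array([
        [1.0, 0.0, 0.0],
        [r, gamma, 0.0],
        [r * r, 2.0 * r * gamma, gamma * gamma],
    ])


def sketch_dp(P, rewards, gamma, num_iters=200):
    """Iterate U(x) <- B_{r(x)} @ sum_x' P[x, x'] U(x') on mean embeddings.

    P is an (n, n) row-stochastic transition matrix and rewards[x] is the
    deterministic reward received in state x (a simplifying assumption).
    """
    n = P.shape[0]
    U = np.zeros((n, 3))
    U[:, 0] = 1.0  # the first feature is the constant 1
    B = np.stack([bellman_coefficients(r, gamma) for r in rewards])
    for _ in range(num_iters):
        expected_next = P @ U  # (n, 3): expected embedding of the next state
        U = np.einsum("xij,xj->xi", B, expected_next)
    return U


# Hypothetical two-state chain: state 0 pays reward 1, state 1 is absorbing.
P = np.array([[0.5, 0.5], [0.0, 1.0]])
rewards = np.array([1.0, 0.0])
gamma = 0.9

U = sketch_dp(P, rewards, gamma)
mean = U[:, 1]                      # E[G(x)]
variance = U[:, 2] - U[:, 1] ** 2   # E[G(x)^2] - E[G(x)]^2
```

With these features, the second column of `U` coincides with the ordinary value function, and the third column carries the second moment of the return, so the variance is obtained from the embedding alone.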

Cite this Paper


BibTeX
@InProceedings{pmlr-v235-wenliang24a,
  title = {Distributional {B}ellman Operators over Mean Embeddings},
  author = {Wenliang, Li Kevin and Deletang, Gregoire and Aitchison, Matthew and Hutter, Marcus and Ruoss, Anian and Gretton, Arthur and Rowland, Mark},
  booktitle = {Proceedings of the 41st International Conference on Machine Learning},
  pages = {52839--52868},
  year = {2024},
  editor = {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix},
  volume = {235},
  series = {Proceedings of Machine Learning Research},
  month = {21--27 Jul},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v235/main/assets/wenliang24a/wenliang24a.pdf},
  url = {https://proceedings.mlr.press/v235/wenliang24a.html},
  abstract = {We propose a novel algorithmic framework for distributional reinforcement learning, based on learning finite-dimensional mean embeddings of return distributions. The framework reveals a wide variety of new algorithms for dynamic programming and temporal-difference algorithms that rely on the sketch Bellman operator, which updates mean embeddings with simple linear-algebraic computations. We provide asymptotic convergence theory, and examine the empirical performance of the algorithms on a suite of tabular tasks. Further, we show that this approach can be straightforwardly combined with deep reinforcement learning.}
}
Endnote
%0 Conference Paper
%T Distributional Bellman Operators over Mean Embeddings
%A Li Kevin Wenliang
%A Gregoire Deletang
%A Matthew Aitchison
%A Marcus Hutter
%A Anian Ruoss
%A Arthur Gretton
%A Mark Rowland
%B Proceedings of the 41st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Ruslan Salakhutdinov
%E Zico Kolter
%E Katherine Heller
%E Adrian Weller
%E Nuria Oliver
%E Jonathan Scarlett
%E Felix Berkenkamp
%F pmlr-v235-wenliang24a
%I PMLR
%P 52839--52868
%U https://proceedings.mlr.press/v235/wenliang24a.html
%V 235
%X We propose a novel algorithmic framework for distributional reinforcement learning, based on learning finite-dimensional mean embeddings of return distributions. The framework reveals a wide variety of new algorithms for dynamic programming and temporal-difference algorithms that rely on the sketch Bellman operator, which updates mean embeddings with simple linear-algebraic computations. We provide asymptotic convergence theory, and examine the empirical performance of the algorithms on a suite of tabular tasks. Further, we show that this approach can be straightforwardly combined with deep reinforcement learning.
APA
Wenliang, L.K., Deletang, G., Aitchison, M., Hutter, M., Ruoss, A., Gretton, A. & Rowland, M. (2024). Distributional Bellman Operators over Mean Embeddings. Proceedings of the 41st International Conference on Machine Learning, in Proceedings of Machine Learning Research 235:52839-52868. Available from https://proceedings.mlr.press/v235/wenliang24a.html.
