Gradient Boosting Reinforcement Learning

Benjamin Fuhrer, Chen Tessler, Gal Dalal
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:17960-17985, 2025.

Abstract

We present Gradient Boosting Reinforcement Learning (GBRL), a framework that adapts the strengths of Gradient Boosting Trees (GBTs) to reinforcement learning (RL) tasks. While neural networks (NNs) have become the de facto choice for RL, they struggle with structured and categorical features and tend to generalize poorly to out-of-distribution samples; these are precisely the settings in which GBTs have traditionally excelled in supervised learning. Nevertheless, the application of GBTs in RL has been limited: traditional GBT libraries are designed for static datasets with fixed labels, making them incompatible with RL's dynamic nature, where both state distributions and reward signals evolve during training. GBRL overcomes this limitation by continuously interleaving tree construction with environment interaction. Through extensive experiments, we demonstrate that GBRL outperforms NNs in domains with structured observations and categorical features, while maintaining competitive performance on standard continuous control benchmarks. Like its supervised learning counterpart, GBRL demonstrates superior robustness to out-of-distribution samples and better handles irregular state-action relationships.
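To make the key idea concrete, below is a minimal, illustrative sketch of what "interleaving tree construction with environment interaction" could look like; it is not the authors' implementation or the GBRL library's API. The class name, the use of scikit-learn's DecisionTreeRegressor as the weak learner, and all hyperparameters are stand-ins chosen for illustration. Each boosting step fits a new tree to the current policy gradient computed from freshly collected rollouts, rather than to fixed labels as in standard supervised GBT training.

```python
# Illustrative sketch only -- not the GBRL library's API.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

class GradientBoostedPolicy:
    """Additive tree ensemble whose output is the logits of a discrete policy."""

    def __init__(self, n_actions, learning_rate=0.1, max_depth=3):
        self.n_actions = n_actions
        self.lr = learning_rate
        self.max_depth = max_depth
        self.trees = []  # one multi-output regression tree per boosting step

    def logits(self, states):
        # Ensemble prediction: sum of scaled tree outputs (zero before any boosting).
        out = np.zeros((len(states), self.n_actions))
        for tree in self.trees:
            out += self.lr * tree.predict(states)
        return out

    def act(self, state, rng=np.random):
        # Sample an action from the softmax policy induced by the logits.
        z = self.logits(state[None])[0]
        p = np.exp(z - z.max())
        p /= p.sum()
        return rng.choice(self.n_actions, p=p)

    def boost(self, states, actions, advantages):
        """One boosting step: grow a tree that approximates the policy gradient."""
        z = self.logits(states)
        p = np.exp(z - z.max(axis=1, keepdims=True))
        p /= p.sum(axis=1, keepdims=True)
        # REINFORCE-style gradient of A * log pi(a|s) w.r.t. the logits:
        # A * (one_hot(a) - pi(.|s)). Fitting a tree to this signal and adding
        # it to the ensemble moves the policy in the ascent direction.
        g = -p * advantages[:, None]
        g[np.arange(len(actions)), actions] += advantages
        tree = DecisionTreeRegressor(max_depth=self.max_depth)
        tree.fit(states, g)  # multi-output regression: one target per action logit
        self.trees.append(tree)
```

Training would then alternate: roll out the current ensemble in the environment, estimate advantages (e.g., returns minus a value baseline), call boost on the fresh batch, and repeat. This is the interleaving that a static GBT library, which assumes one fixed dataset with fixed labels, cannot express.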

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-fuhrer25a,
  title     = {Gradient Boosting Reinforcement Learning},
  author    = {Fuhrer, Benjamin and Tessler, Chen and Dalal, Gal},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages     = {17960--17985},
  year      = {2025},
  editor    = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume    = {267},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--19 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/fuhrer25a/fuhrer25a.pdf},
  url       = {https://proceedings.mlr.press/v267/fuhrer25a.html}
}
Endnote
%0 Conference Paper
%T Gradient Boosting Reinforcement Learning
%A Benjamin Fuhrer
%A Chen Tessler
%A Gal Dalal
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-fuhrer25a
%I PMLR
%P 17960--17985
%U https://proceedings.mlr.press/v267/fuhrer25a.html
%V 267
APA
Fuhrer, B., Tessler, C. & Dalal, G. (2025). Gradient Boosting Reinforcement Learning. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:17960-17985. Available from https://proceedings.mlr.press/v267/fuhrer25a.html.
