Solving Offline Reinforcement Learning with Decision Tree Regression

Prajwal Koirala, Cody Fleming
Proceedings of The 8th Conference on Robot Learning, PMLR 270:2147-2163, 2025.

Abstract

This study presents a novel approach to addressing offline reinforcement learning (RL) problems by reframing them as regression tasks that can be effectively solved using Decision Trees. Mainly, we introduce two distinct frameworks: return-conditioned and return-weighted decision tree policies (RCDTP and RWDTP), both of which achieve notable speed in agent training as well as inference, with training typically lasting less than a few minutes. Despite the simplification inherent in this reformulated approach to offline RL, our agents demonstrate performance that is at least on par with the established methods. We evaluate our methods on D4RL datasets for locomotion and manipulation, as well as other robotic tasks involving wheeled and flying robots. Additionally, we assess performance in delayed/sparse reward scenarios and highlight the explainability of these policies through action distribution and feature importance.
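As a rough illustration of how offline RL can be posed as regression, the sketch below builds supervised training data from one logged trajectory. It is not the authors' implementation. It assumes that "return-conditioned" means appending a return-to-go value to each state as an extra input feature, and that "return-weighted" means deriving a per-sample weight from that same quantity; the abstract specifies neither. The function names, the discount `gamma`, and the min-max weighting are all hypothetical choices made for this example.

```python
from typing import List, Sequence, Tuple


def returns_to_go(rewards: Sequence[float], gamma: float = 1.0) -> List[float]:
    """Discounted sum of rewards from each timestep to the end of the trajectory."""
    rtg = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step reuses the return already accumulated after it.
    for t in range(len(rewards) - 1, -1, -1):
        running = rewards[t] + gamma * running
        rtg[t] = running
    return rtg


def return_conditioned_dataset(
    states: Sequence[Sequence[float]],
    actions: Sequence[Sequence[float]],
    rewards: Sequence[float],
    gamma: float = 1.0,
) -> Tuple[List[List[float]], List[List[float]]]:
    """Regression inputs are [state features..., return-to-go]; targets are logged actions.

    Assumption: this feature layout is illustrative, not taken from the paper.
    """
    rtg = returns_to_go(rewards, gamma)
    X = [list(s) + [g] for s, g in zip(states, rtg)]
    y = [list(a) for a in actions]
    return X, y


def return_weights(rewards: Sequence[float], gamma: float = 1.0) -> List[float]:
    """Min-max normalise returns-to-go into [0, 1] for use as per-sample weights.

    Assumption: min-max scaling is a placeholder weighting scheme.
    """
    rtg = returns_to_go(rewards, gamma)
    lo, hi = min(rtg), max(rtg)
    if hi == lo:
        # All returns equal: fall back to uniform weights.
        return [1.0] * len(rtg)
    return [(g - lo) / (hi - lo) for g in rtg]


# Toy trajectory with one state feature and one action dimension.
states = [[0.1], [0.2], [0.3]]
actions = [[1.0], [0.0], [-1.0]]
rewards = [1.0, 0.0, 2.0]

X, y = return_conditioned_dataset(states, actions, rewards)
w = return_weights(rewards)
```

The resulting `X` and `y` could then be passed to any tree-based regressor that maps inputs to actions, for example scikit-learn's `DecisionTreeRegressor.fit(X, y)`, with `w` supplied as `sample_weight` for a weighted variant. Which tree learner and hyperparameters the paper actually uses is not stated on this page.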

Cite this Paper


BibTeX
@InProceedings{pmlr-v270-koirala25a,
  title = {Solving Offline Reinforcement Learning with Decision Tree Regression},
  author = {Koirala, Prajwal and Fleming, Cody},
  booktitle = {Proceedings of The 8th Conference on Robot Learning},
  pages = {2147--2163},
  year = {2025},
  editor = {Agrawal, Pulkit and Kroemer, Oliver and Burgard, Wolfram},
  volume = {270},
  series = {Proceedings of Machine Learning Research},
  month = {06--09 Nov},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v270/main/assets/koirala25a/koirala25a.pdf},
  url = {https://proceedings.mlr.press/v270/koirala25a.html},
  abstract = {This study presents a novel approach to addressing offline reinforcement learning (RL) problems by reframing them as regression tasks that can be effectively solved using Decision Trees. Mainly, we introduce two distinct frameworks: return-conditioned and return-weighted decision tree policies (RCDTP and RWDTP), both of which achieve notable speed in agent training as well as inference, with training typically lasting less than a few minutes. Despite the simplification inherent in this reformulated approach to offline RL, our agents demonstrate performance that is at least on par with the established methods. We evaluate our methods on D4RL datasets for locomotion and manipulation, as well as other robotic tasks involving wheeled and flying robots. Additionally, we assess performance in delayed/sparse reward scenarios and highlight the explainability of these policies through action distribution and feature importance.}
}
Endnote
%0 Conference Paper
%T Solving Offline Reinforcement Learning with Decision Tree Regression
%A Prajwal Koirala
%A Cody Fleming
%B Proceedings of The 8th Conference on Robot Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Pulkit Agrawal
%E Oliver Kroemer
%E Wolfram Burgard
%F pmlr-v270-koirala25a
%I PMLR
%P 2147--2163
%U https://proceedings.mlr.press/v270/koirala25a.html
%V 270
%X This study presents a novel approach to addressing offline reinforcement learning (RL) problems by reframing them as regression tasks that can be effectively solved using Decision Trees. Mainly, we introduce two distinct frameworks: return-conditioned and return-weighted decision tree policies (RCDTP and RWDTP), both of which achieve notable speed in agent training as well as inference, with training typically lasting less than a few minutes. Despite the simplification inherent in this reformulated approach to offline RL, our agents demonstrate performance that is at least on par with the established methods. We evaluate our methods on D4RL datasets for locomotion and manipulation, as well as other robotic tasks involving wheeled and flying robots. Additionally, we assess performance in delayed/sparse reward scenarios and highlight the explainability of these policies through action distribution and feature importance.
APA
Koirala, P. & Fleming, C. (2025). Solving Offline Reinforcement Learning with Decision Tree Regression. Proceedings of The 8th Conference on Robot Learning, in Proceedings of Machine Learning Research 270:2147-2163. Available from https://proceedings.mlr.press/v270/koirala25a.html.
