Optimal Decision Tree Pruning Revisited: Algorithms and Complexity

Juha Harviainen, Frank Sommer, Manuel Sorge, Stefan Szeider
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:22177-22203, 2025.

Abstract

We present a comprehensive classical and parameterized complexity analysis of decision tree pruning operations, extending recent research on the complexity of learning small decision trees. In doing so, we offer new insights into the computational challenges of decision tree simplification, a crucial aspect of developing interpretable and efficient machine learning models. We focus on the fundamental pruning operations of subtree replacement and subtree raising, which are used in heuristic pruning methods. Surprisingly, while optimal pruning can be performed in polynomial time for subtree replacement, the problem is NP-complete for subtree raising. We therefore identify parameters, and combinations thereof, that lead to fixed-parameter tractability or hardness, establishing a precise borderline between these complexity classes. For example, while subtree raising remains hard even for small domain size $D$ or a small number $d$ of features, it can be solved in $D^{2d} \cdot |I|^{O(1)}$ time, where $|I|$ is the input size. We complement our theoretical findings with preliminary experimental results that demonstrate the practical implications of our analysis.
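To make the pruning operations concrete, the following is a minimal illustrative sketch of subtree replacement on a validation (pruning) set: bottom-up, a subtree is replaced by a leaf labelled with the majority class whenever this does not increase the error on that set. This is not the paper's algorithm; all names here (TreeNode, errors, prune_replace) are hypothetical.

```python
# Hedged sketch of subtree replacement pruning (not the paper's exact method).
from collections import Counter
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class TreeNode:
    feature: Optional[int] = None                  # None means this node is a leaf
    children: dict = field(default_factory=dict)   # feature value -> TreeNode
    label: Optional[int] = None                    # class label if leaf

def classify(node: TreeNode, x) -> int:
    # Descend until a leaf is reached.
    while node.feature is not None:
        node = node.children[x[node.feature]]
    return node.label

def errors(node: TreeNode, examples) -> int:
    # Number of misclassified (x, y) pairs.
    return sum(classify(node, x) != y for x, y in examples)

def prune_replace(node: TreeNode, examples) -> TreeNode:
    """Bottom-up subtree replacement with respect to the labelled examples."""
    if node.feature is None:
        return node
    # Prune the children first, routing each example to the matching child.
    for value, child in list(node.children.items()):
        subset = [(x, y) for x, y in examples if x[node.feature] == value]
        node.children[value] = prune_replace(child, subset)
    # Candidate replacement: a leaf labelled with the majority class.
    majority = Counter(y for _, y in examples).most_common(1)
    if majority:
        leaf = TreeNode(label=majority[0][0])
        if errors(leaf, examples) <= errors(node, examples):
            return leaf
    return node
```

Subtree raising, by contrast, replaces an internal node by one of its subtrees (re-routing the affected examples), and it is this operation that the paper shows to be NP-complete to apply optimally.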

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-harviainen25a,
  title     = {Optimal Decision Tree Pruning Revisited: Algorithms and Complexity},
  author    = {Harviainen, Juha and Sommer, Frank and Sorge, Manuel and Szeider, Stefan},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages     = {22177--22203},
  year      = {2025},
  editor    = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume    = {267},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--19 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/harviainen25a/harviainen25a.pdf},
  url       = {https://proceedings.mlr.press/v267/harviainen25a.html},
  abstract  = {We present a comprehensive classical and parameterized complexity analysis of decision tree pruning operations, extending recent research on the complexity of learning small decision trees. Thereby, we offer new insights into the computational challenges of decision tree simplification, a crucial aspect of developing interpretable and efficient machine learning models. We focus on fundamental pruning operations of subtree replacement and raising, which are used in heuristics. Surprisingly, while optimal pruning can be performed in polynomial time for subtree replacement, the problem is NP-complete for subtree raising. Therefore, we identify parameters and combinations thereof that lead to fixed-parameter tractability or hardness, establishing a precise borderline between these complexity classes. For example, while subtree raising is hard for small domain size $D$ or number $d$ of features, it can be solved in $D^{2d} \cdot |I|^{O(1)}$ time, where $|I|$ is the input size. We complement our theoretical findings with preliminary experimental results, demonstrating the practical implications of our analysis.}
}
Endnote
%0 Conference Paper
%T Optimal Decision Tree Pruning Revisited: Algorithms and Complexity
%A Juha Harviainen
%A Frank Sommer
%A Manuel Sorge
%A Stefan Szeider
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-harviainen25a
%I PMLR
%P 22177--22203
%U https://proceedings.mlr.press/v267/harviainen25a.html
%V 267
%X We present a comprehensive classical and parameterized complexity analysis of decision tree pruning operations, extending recent research on the complexity of learning small decision trees. Thereby, we offer new insights into the computational challenges of decision tree simplification, a crucial aspect of developing interpretable and efficient machine learning models. We focus on fundamental pruning operations of subtree replacement and raising, which are used in heuristics. Surprisingly, while optimal pruning can be performed in polynomial time for subtree replacement, the problem is NP-complete for subtree raising. Therefore, we identify parameters and combinations thereof that lead to fixed-parameter tractability or hardness, establishing a precise borderline between these complexity classes. For example, while subtree raising is hard for small domain size $D$ or number $d$ of features, it can be solved in $D^{2d} \cdot |I|^{O(1)}$ time, where $|I|$ is the input size. We complement our theoretical findings with preliminary experimental results, demonstrating the practical implications of our analysis.
APA
Harviainen, J., Sommer, F., Sorge, M. & Szeider, S. (2025). Optimal Decision Tree Pruning Revisited: Algorithms and Complexity. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:22177-22203. Available from https://proceedings.mlr.press/v267/harviainen25a.html.