Quant-BnB: A Scalable Branch-and-Bound Method for Optimal Decision Trees with Continuous Features

Rahul Mazumder, Xiang Meng, Haoyue Wang
Proceedings of the 39th International Conference on Machine Learning, PMLR 162:15255-15277, 2022.

Abstract

Decision trees are one of the most useful and popular methods in the machine learning toolbox. In this paper, we consider the problem of learning optimal decision trees, a combinatorial optimization problem that is challenging to solve at scale. A common approach in the literature is to use greedy heuristics, which may not be optimal. Recently there has been significant interest in learning optimal decision trees using various approaches (e.g., based on integer programming, dynamic programming)—to achieve computational scalability, most of these approaches focus on classification tasks with binary features. In this paper, we present a new discrete optimization method based on branch-and-bound (BnB) to obtain optimal decision trees. Different from existing customized approaches, we consider both regression and classification tasks with continuous features. The basic idea underlying our approach is to split the search space based on the quantiles of the feature distribution—leading to upper and lower bounds for the underlying optimization problem along the BnB iterations. Our proposed algorithm Quant-BnB shows significant speedups compared to existing approaches for shallow optimal trees on various real datasets.

Cite this Paper


BibTeX
@InProceedings{pmlr-v162-mazumder22a, title = {Quant-{B}n{B}: A Scalable Branch-and-Bound Method for Optimal Decision Trees with Continuous Features}, author = {Mazumder, Rahul and Meng, Xiang and Wang, Haoyue}, booktitle = {Proceedings of the 39th International Conference on Machine Learning}, pages = {15255--15277}, year = {2022}, editor = {Chaudhuri, Kamalika and Jegelka, Stefanie and Song, Le and Szepesvari, Csaba and Niu, Gang and Sabato, Sivan}, volume = {162}, series = {Proceedings of Machine Learning Research}, month = {17--23 Jul}, publisher = {PMLR}, pdf = {https://proceedings.mlr.press/v162/mazumder22a/mazumder22a.pdf}, url = {https://proceedings.mlr.press/v162/mazumder22a.html}, abstract = {Decision trees are one of the most useful and popular methods in the machine learning toolbox. In this paper, we consider the problem of learning optimal decision trees, a combinatorial optimization problem that is challenging to solve at scale. A common approach in the literature is to use greedy heuristics, which may not be optimal. Recently there has been significant interest in learning optimal decision trees using various approaches (e.g., based on integer programming, dynamic programming)—to achieve computational scalability, most of these approaches focus on classification tasks with binary features. In this paper, we present a new discrete optimization method based on branch-and-bound (BnB) to obtain optimal decision trees. Different from existing customized approaches, we consider both regression and classification tasks with continuous features. The basic idea underlying our approach is to split the search space based on the quantiles of the feature distribution—leading to upper and lower bounds for the underlying optimization problem along the BnB iterations. Our proposed algorithm Quant-BnB shows significant speedups compared to existing approaches for shallow optimal trees on various real datasets.} }
Endnote
%0 Conference Paper %T Quant-BnB: A Scalable Branch-and-Bound Method for Optimal Decision Trees with Continuous Features %A Rahul Mazumder %A Xiang Meng %A Haoyue Wang %B Proceedings of the 39th International Conference on Machine Learning %C Proceedings of Machine Learning Research %D 2022 %E Kamalika Chaudhuri %E Stefanie Jegelka %E Le Song %E Csaba Szepesvari %E Gang Niu %E Sivan Sabato %F pmlr-v162-mazumder22a %I PMLR %P 15255--15277 %U https://proceedings.mlr.press/v162/mazumder22a.html %V 162 %X Decision trees are one of the most useful and popular methods in the machine learning toolbox. In this paper, we consider the problem of learning optimal decision trees, a combinatorial optimization problem that is challenging to solve at scale. A common approach in the literature is to use greedy heuristics, which may not be optimal. Recently there has been significant interest in learning optimal decision trees using various approaches (e.g., based on integer programming, dynamic programming)—to achieve computational scalability, most of these approaches focus on classification tasks with binary features. In this paper, we present a new discrete optimization method based on branch-and-bound (BnB) to obtain optimal decision trees. Different from existing customized approaches, we consider both regression and classification tasks with continuous features. The basic idea underlying our approach is to split the search space based on the quantiles of the feature distribution—leading to upper and lower bounds for the underlying optimization problem along the BnB iterations. Our proposed algorithm Quant-BnB shows significant speedups compared to existing approaches for shallow optimal trees on various real datasets.
APA
Mazumder, R., Meng, X. & Wang, H.. (2022). Quant-BnB: A Scalable Branch-and-Bound Method for Optimal Decision Trees with Continuous Features. Proceedings of the 39th International Conference on Machine Learning, in Proceedings of Machine Learning Research 162:15255-15277 Available from https://proceedings.mlr.press/v162/mazumder22a.html.

Related Material