Provable guarantees for decision tree induction: the agnostic setting

Guy Blanc, Jane Lange, Li-Yang Tan
Proceedings of the 37th International Conference on Machine Learning, PMLR 119:941-949, 2020.

Abstract

We give strengthened provable guarantees on the performance of widely employed and empirically successful top-down decision tree learning heuristics. While prior works have focused on the realizable setting, we consider the more realistic and challenging agnostic setting. We show that for all monotone functions $f$ and $s\in \mathbb{N}$, these heuristics construct a decision tree of size $s^{\tilde{O}((\log s)/\varepsilon^2)}$ that achieves error $\le \mathsf{opt}_s + \varepsilon$, where $\mathsf{opt}_s$ denotes the error of the optimal size-$s$ decision tree for $f$. Previously such a guarantee was not known to be achievable by any algorithm, even one that is not based on top-down heuristics. We complement our algorithmic guarantee with a near-matching $s^{\tilde{\Omega}(\log s)}$ lower bound.
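The heuristics analyzed here are the standard greedy, impurity-based procedures in the style of ID3/C4.5/CART: repeatedly split on the coordinate that most reduces label impurity. As an illustration only, and not the paper's exact construction, the following is a minimal Python sketch of such a top-down heuristic on Boolean data; the dataset format, the choice of Gini impurity, and the crude budget-splitting rule are all assumptions made for this sketch.

```python
# A hypothetical sketch of a generic top-down, impurity-based decision tree
# heuristic (in the spirit of ID3/CART). Illustrative only; not the paper's
# algorithm or analysis.

def impurity(labels):
    """Gini impurity of a list of 0/1 labels (one common impurity choice)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def best_split(data):
    """Return the coordinate whose split most reduces weighted impurity,
    or None if no split helps. `data` is a list of (x, y) with x a 0/1 tuple."""
    n = len(data[0][0])
    base = impurity([y for _, y in data])
    best_i, best_gain = None, 0.0
    for i in range(n):
        left = [y for x, y in data if x[i] == 0]
        right = [y for x, y in data if x[i] == 1]
        w = len(data)
        gain = base - (len(left) / w) * impurity(left) \
                    - (len(right) / w) * impurity(right)
        if gain > best_gain:
            best_i, best_gain = i, gain
    return best_i

def grow_tree(data, size_budget):
    """Greedy top-down growth until the split budget runs out.
    Returns either a leaf label or a triple (split_index, left, right)."""
    labels = [y for _, y in data]
    majority = int(2 * sum(labels) >= len(labels))  # ties/empty default to 1
    i = best_split(data) if size_budget > 0 and data else None
    if i is None:
        return majority                             # leaf: majority label
    left = [(x, y) for x, y in data if x[i] == 0]
    right = [(x, y) for x, y in data if x[i] == 1]
    half = (size_budget - 1) // 2                   # naive budget split (assumption)
    return (i, grow_tree(left, half), grow_tree(right, size_budget - 1 - half))

if __name__ == "__main__":
    # Tiny hypothetical example: learn x0 AND x1 from four labeled points.
    data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
    print(grow_tree(data, size_budget=3))
```

In this sketch the budget is divided naively between the two subtrees; the paper's guarantee concerns how large a budget such greedy procedures need (quasipolynomial, $s^{\tilde{O}((\log s)/\varepsilon^2)}$) before their error provably comes within $\varepsilon$ of the best size-$s$ tree.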

Cite this Paper


BibTeX
@InProceedings{pmlr-v119-blanc20a,
  title     = {Provable guarantees for decision tree induction: the agnostic setting},
  author    = {Blanc, Guy and Lange, Jane and Tan, Li-Yang},
  booktitle = {Proceedings of the 37th International Conference on Machine Learning},
  pages     = {941--949},
  year      = {2020},
  editor    = {III, Hal Daumé and Singh, Aarti},
  volume    = {119},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--18 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v119/blanc20a/blanc20a.pdf},
  url       = {https://proceedings.mlr.press/v119/blanc20a.html}
}
Endnote
%0 Conference Paper
%T Provable guarantees for decision tree induction: the agnostic setting
%A Guy Blanc
%A Jane Lange
%A Li-Yang Tan
%B Proceedings of the 37th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2020
%E Hal Daumé III
%E Aarti Singh
%F pmlr-v119-blanc20a
%I PMLR
%P 941--949
%U https://proceedings.mlr.press/v119/blanc20a.html
%V 119
APA
Blanc, G., Lange, J. & Tan, L. (2020). Provable guarantees for decision tree induction: the agnostic setting. Proceedings of the 37th International Conference on Machine Learning, in Proceedings of Machine Learning Research 119:941-949. Available from https://proceedings.mlr.press/v119/blanc20a.html.