Popular decision tree algorithms are provably noise tolerant

Guy Blanc, Jane Lange, Ali Malik, Li-Yang Tan
Proceedings of the 39th International Conference on Machine Learning, PMLR 162:2091-2106, 2022.

Abstract

Using the framework of boosting, we prove that all impurity-based decision tree learning algorithms, including the classic ID3, C4.5, and CART, are highly noise tolerant. Our guarantees hold under the strongest noise model of nasty noise, and we provide near-matching upper and lower bounds on the allowable noise rate. We further show that these algorithms, which are simple and have long been central to everyday machine learning, enjoy provable guarantees in the noisy setting that are unmatched by existing algorithms in the theoretical literature on decision tree learning. Taken together, our results add to an ongoing line of research that seeks to place the empirical success of these practical decision tree algorithms on firm theoretical footing.
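The "impurity-based" algorithms covered by the paper's guarantee (ID3, C4.5, CART) all grow a tree greedily, splitting on the feature that most reduces an impurity measure of the labels. As an illustration only (not the paper's construction), here is a minimal sketch of one such step using Gini impurity, as in CART; binary features and binary labels are assumed for simplicity, and the function names are ours:

```python
import numpy as np

def gini_impurity(y):
    """Gini impurity of a binary label vector y (labels in {0, 1})."""
    if len(y) == 0:
        return 0.0
    p = np.mean(y)
    return 2 * p * (1 - p)

def best_split(X, y):
    """Greedily pick the feature whose split most reduces the weighted
    Gini impurity, as impurity-based learners such as CART do.
    X: (n, d) array of binary features; y: (n,) array of binary labels.
    Returns (feature index or None, impurity reduction achieved)."""
    n, d = X.shape
    base = gini_impurity(y)
    best_feature, best_gain = None, 0.0
    for j in range(d):
        left = y[X[:, j] == 0]
        right = y[X[:, j] == 1]
        weighted = (len(left) * gini_impurity(left)
                    + len(right) * gini_impurity(right)) / n
        gain = base - weighted
        if gain > best_gain:
            best_feature, best_gain = j, gain
    return best_feature, best_gain
```

Swapping Gini for entropy (as in ID3/C4.5) changes only the impurity function; the paper's noise-tolerance guarantee applies to this whole family of greedy impurity-based learners.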

Cite this Paper


BibTeX
@InProceedings{pmlr-v162-blanc22b,
  title     = {Popular decision tree algorithms are provably noise tolerant},
  author    = {Blanc, Guy and Lange, Jane and Malik, Ali and Tan, Li-Yang},
  booktitle = {Proceedings of the 39th International Conference on Machine Learning},
  pages     = {2091--2106},
  year      = {2022},
  editor    = {Chaudhuri, Kamalika and Jegelka, Stefanie and Song, Le and Szepesvari, Csaba and Niu, Gang and Sabato, Sivan},
  volume    = {162},
  series    = {Proceedings of Machine Learning Research},
  month     = {17--23 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v162/blanc22b/blanc22b.pdf},
  url       = {https://proceedings.mlr.press/v162/blanc22b.html},
  abstract  = {Using the framework of boosting, we prove that all impurity-based decision tree learning algorithms, including the classic ID3, C4.5, and CART, are highly noise tolerant. Our guarantees hold under the strongest noise model of nasty noise, and we provide near-matching upper and lower bounds on the allowable noise rate. We further show that these algorithms, which are simple and have long been central to everyday machine learning, enjoy provable guarantees in the noisy setting that are unmatched by existing algorithms in the theoretical literature on decision tree learning. Taken together, our results add to an ongoing line of research that seeks to place the empirical success of these practical decision tree algorithms on firm theoretical footing.}
}
Endnote
%0 Conference Paper
%T Popular decision tree algorithms are provably noise tolerant
%A Guy Blanc
%A Jane Lange
%A Ali Malik
%A Li-Yang Tan
%B Proceedings of the 39th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2022
%E Kamalika Chaudhuri
%E Stefanie Jegelka
%E Le Song
%E Csaba Szepesvari
%E Gang Niu
%E Sivan Sabato
%F pmlr-v162-blanc22b
%I PMLR
%P 2091--2106
%U https://proceedings.mlr.press/v162/blanc22b.html
%V 162
%X Using the framework of boosting, we prove that all impurity-based decision tree learning algorithms, including the classic ID3, C4.5, and CART, are highly noise tolerant. Our guarantees hold under the strongest noise model of nasty noise, and we provide near-matching upper and lower bounds on the allowable noise rate. We further show that these algorithms, which are simple and have long been central to everyday machine learning, enjoy provable guarantees in the noisy setting that are unmatched by existing algorithms in the theoretical literature on decision tree learning. Taken together, our results add to an ongoing line of research that seeks to place the empirical success of these practical decision tree algorithms on firm theoretical footing.
APA
Blanc, G., Lange, J., Malik, A. & Tan, L. (2022). Popular decision tree algorithms are provably noise tolerant. Proceedings of the 39th International Conference on Machine Learning, in Proceedings of Machine Learning Research 162:2091-2106. Available from https://proceedings.mlr.press/v162/blanc22b.html.