Is regularization unnecessary for boosting?
Proceedings of the Eighth International Workshop on Artificial Intelligence and Statistics, PMLR R3:129-136, 2001.
Boosting algorithms are often observed to be resistant to overfitting, to such a degree that one may wonder whether it is harmless to run the algorithms forever, and whether regularization in one way or another is unnecessary [see, e.g., Schapire (1999); Friedman, Hastie and Tibshirani (1999); Grove and Schuurmans (1998); Mason, Baxter, Bartlett and Frean (1999)]. One may also wonder whether it is possible to adapt the boosting ideas to regression, and whether the need for regularization can be avoided just by adopting the boosting device. In this paper we present examples where 'boosting forever' leads to suboptimal predictions, while some regularization method, on the other hand, can achieve asymptotic optimality, at least in theory. We conjecture that this can be true in more general situations, and for some other regularization methods as well. Therefore the emerging literature on regularized variants of boosting is not unnecessary, and should instead be encouraged. The results of this paper are obtained from an analogy between some boosting algorithms that are used in regression and classification.