Speed and Sparsity of Regularized Boosting

Yongxin Xi, Zhen Xiang, Peter Ramadge, Robert Schapire ;
Proceedings of the Twelfth International Conference on Artificial Intelligence and Statistics, PMLR 5:615-622, 2009.

Abstract

Boosting algorithms with l1 regularization are of interest because l1 regularization leads to sparser composite classifiers. Moreover, Rosset et al. have shown that for separable data, standard lp regularized loss minimization results in a margin maximizing classifier in the limit as regularization is relaxed. For the case p=1, we extend these results by obtaining explicit convergence bounds on the regularization required to yield a margin within prescribed accuracy of the maximum achievable margin. We derive similar rates of convergence for the epsilon AdaBoost algorithm, in the process providing a new proof that epsilon AdaBoost is margin maximizing as epsilon converges to 0. Because both of these known algorithms are computationally expensive, we introduce a new hybrid algorithm, AdaBoost+L1, that combines the virtues of AdaBoost with the sparsity of l1 regularization in a computationally efficient fashion. We prove that the algorithm is margin maximizing and empirically examine its performance on five datasets.
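The epsilon AdaBoost variant discussed above replaces AdaBoost's line-search step with a small fixed step epsilon, which is what yields margin maximization in the limit. A minimal sketch of this idea, on a toy 1-D dataset with hypothetical threshold stumps (the dataset, stump thresholds, and round count are illustrative assumptions, not from the paper):

```python
import numpy as np

def eps_adaboost(X, y, stumps, epsilon=0.05, rounds=100):
    """Sketch of epsilon AdaBoost: each round adds the weak learner
    with the largest weighted edge, using a fixed step epsilon
    instead of AdaBoost's optimal (line-search) step size."""
    n = len(y)
    w = np.full(n, 1.0 / n)            # example weights, kept normalized
    alphas = np.zeros(len(stumps))     # accumulated weak-learner coefficients
    # precompute {-1,+1} predictions of every stump on every example
    H = np.array([[h(x) for x in X] for h in stumps])
    for _ in range(rounds):
        edges = H @ (w * y)            # weighted edge of each weak learner
        j = int(np.argmax(np.abs(edges)))
        s = 1.0 if edges[j] >= 0 else -1.0
        alphas[j] += s * epsilon       # small fixed step on the chosen stump
        # multiplicative weight update, then renormalize
        w *= np.exp(-s * epsilon * y * H[j])
        w /= w.sum()
    return alphas

# toy separable data with illustrative threshold stumps
X = np.array([-2.0, -1.0, -0.5, 0.5, 1.0, 2.0])
y = np.array([-1, -1, -1, 1, 1, 1])
stumps = [lambda x, t=t: 1.0 if x > t else -1.0 for t in (-1.5, 0.0, 1.5)]
alphas = eps_adaboost(X, y, stumps)
# normalized margin of each example under the combined classifier
F = np.array([sum(a * h(x) for a, h in zip(alphas, stumps)) for x in X])
margins = y * F / np.abs(alphas).sum()
print(min(margins) > 0)  # separable data: every margin is positive
```

Shrinking epsilon toward 0 drives the minimum normalized margin toward the maximum achievable margin, which is the convergence behavior the paper quantifies; the hybrid AdaBoost+L1 algorithm it introduces aims for the same guarantee at lower computational cost.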
