Speed and Sparsity of Regularized Boosting
Proceedings of the Twelth International Conference on Artificial Intelligence and Statistics, PMLR 5:615-622, 2009.
Boosting algorithms with $l_1$ regularization are of interest because $l_1$ regularization leads to sparser composite classifiers. Moreover, Rosset et al. have shown that for separable data, standard $l_p$ regularized loss minimization results in a margin maximizing classifier in the limit as regularization is relaxed. For the case $p=1$, we extend these results by obtaining explicit convergence bounds on the regularization required to yield a margin within prescribed accuracy of the maximum achievable margin. We derive similar rates of convergence for the $\epsilon$-AdaBoost algorithm, in the process providing a new proof that $\epsilon$-AdaBoost is margin maximizing as $\epsilon$ converges to $0$. Because both of these known algorithms are computationally expensive, we introduce a new hybrid algorithm, AdaBoost+$L_1$, that combines the virtues of AdaBoost with the sparsity of $l_1$ regularization in a computationally efficient fashion. We prove that the algorithm is margin maximizing and empirically examine its performance on five datasets.