A Column Generation Bound Minimization Approach with PAC-Bayesian Generalization Guarantees

[edit]

Jean-Francis Roy, Mario Marchand, François Laviolette ;
Proceedings of the 19th International Conference on Artificial Intelligence and Statistics, PMLR 51:1241-1249, 2016.

Abstract

The C-bound, introduced in Lacasse et al (2006), gives a tight upper bound on the risk of the majority vote classifier. Laviolette et al. (2011) designed a learning algorithm named MinCq that outputs a dense distribution on a finite set of base classifiers by minimizing the C-bound, together with a PAC-Bayesian generalization guarantee. In this work, we design a column generation algorithm that we call CqBoost, that optimizes the C-bound and outputs a sparse distribution on a possibly infinite set of voters. We also propose a PAC-Bayesian bound for CqBoost that holds for finite and two cases of continuous sets of base classifiers. Finally, compare the accuracy and the sparsity of CqBoost with MinCq and other state-of-the-art boosting algorithms.

Related Material