Boosting, Voting Classifiers and Randomized Sample Compression Schemes

Arthur da Cunha, Kasper Green Larsen, Martin Ritzert
Proceedings of The 36th International Conference on Algorithmic Learning Theory, PMLR 272:390-404, 2025.

Abstract

In Boosting, we aim to leverage multiple weak learners to produce a strong learner. At the center of this paradigm lies the concept of building the strong learner as a voting classifier, which outputs a weighted majority vote of the weak learners. While many successful Boosting algorithms, such as the iconic AdaBoost, produce voting classifiers, their theoretical performance has long remained sub-optimal: the best-known bounds on the number of training examples necessary for a voting classifier to obtain a given accuracy have so far always contained at least two logarithmic factors above what is known to be achievable by general weak-to-strong learners. In this work, we break this barrier by proposing a randomized Boosting algorithm that outputs voting classifiers whose generalization error contains only a single logarithmic dependency on the sample size. We obtain this result by building a general framework that extends sample compression methods to support randomized learning algorithms based on sub-sampling.
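For readers unfamiliar with the setup, the sketch below illustrates the kind of voting classifier the abstract refers to: a standard AdaBoost loop over decision stumps, whose output is the sign of an alpha-weighted sum of weak learners. This is a generic textbook illustration, not the randomized algorithm proposed in the paper; all function names are our own.

    import numpy as np

    def train_stump(X, y, w):
        """Exhaustively pick the decision stump with least weighted error."""
        best = (np.inf, 0, 0.0, 1)  # (error, feature, threshold, polarity)
        for j in range(X.shape[1]):
            for thr in np.unique(X[:, j]):
                for pol in (1, -1):
                    pred = np.where(pol * (X[:, j] - thr) >= 0, 1, -1)
                    err = np.sum(w[pred != y])
                    if err < best[0]:
                        best = (err, j, thr, pol)
        return best

    def adaboost(X, y, T=20):
        """Return (alpha, stump) pairs forming a voting classifier.

        Labels y must be in {-1, +1}.
        """
        n = len(y)
        w = np.full(n, 1.0 / n)
        ensemble = []
        for _ in range(T):
            err, j, thr, pol = train_stump(X, y, w)
            err = max(err, 1e-12)                  # avoid division by zero
            alpha = 0.5 * np.log((1 - err) / err)  # vote weight of this stump
            pred = np.where(pol * (X[:, j] - thr) >= 0, 1, -1)
            w *= np.exp(-alpha * y * pred)         # up-weight misclassified points
            w /= w.sum()
            ensemble.append((alpha, j, thr, pol))
        return ensemble

    def vote(ensemble, X):
        """Weighted majority vote: sign of the alpha-weighted sum of stumps."""
        s = np.zeros(len(X))
        for alpha, j, thr, pol in ensemble:
            s += alpha * np.where(pol * (X[:, j] - thr) >= 0, 1, -1)
        return np.sign(s)

The paper's contribution concerns the sample complexity of such voting classifiers: it randomizes the Boosting procedure via sub-sampling so that the resulting generalization bound carries only one logarithmic factor in the sample size, analyzed through an extension of sample compression schemes to randomized learners.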

Cite this Paper


BibTeX
@InProceedings{pmlr-v272-cunha25a,
  title     = {Boosting, Voting Classifiers and Randomized Sample Compression Schemes},
  author    = {da Cunha, Arthur and Larsen, Kasper Green and Ritzert, Martin},
  booktitle = {Proceedings of The 36th International Conference on Algorithmic Learning Theory},
  pages     = {390--404},
  year      = {2025},
  editor    = {Kamath, Gautam and Loh, Po-Ling},
  volume    = {272},
  series    = {Proceedings of Machine Learning Research},
  month     = {24--27 Feb},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v272/main/assets/cunha25a/cunha25a.pdf},
  url       = {https://proceedings.mlr.press/v272/cunha25a.html},
  abstract  = {In Boosting, we aim to leverage multiple weak learners to produce a strong learner. At the center of this paradigm lies the concept of building the strong learner as a voting classifier, which outputs a weighted majority vote of the weak learners. While many successful Boosting algorithms, such as the iconic AdaBoost, produce voting classifiers, their theoretical performance has long remained sub-optimal: the best-known bounds on the number of training examples necessary for a voting classifier to obtain a given accuracy have so far always contained at least two logarithmic factors above what is known to be achievable by general weak-to-strong learners. In this work, we break this barrier by proposing a randomized Boosting algorithm that outputs voting classifiers whose generalization error contains only a single logarithmic dependency on the sample size. We obtain this result by building a general framework that extends sample compression methods to support randomized learning algorithms based on sub-sampling.}
}
Endnote
%0 Conference Paper
%T Boosting, Voting Classifiers and Randomized Sample Compression Schemes
%A Arthur da Cunha
%A Kasper Green Larsen
%A Martin Ritzert
%B Proceedings of The 36th International Conference on Algorithmic Learning Theory
%C Proceedings of Machine Learning Research
%D 2025
%E Gautam Kamath
%E Po-Ling Loh
%F pmlr-v272-cunha25a
%I PMLR
%P 390--404
%U https://proceedings.mlr.press/v272/cunha25a.html
%V 272
%X In Boosting, we aim to leverage multiple weak learners to produce a strong learner. At the center of this paradigm lies the concept of building the strong learner as a voting classifier, which outputs a weighted majority vote of the weak learners. While many successful Boosting algorithms, such as the iconic AdaBoost, produce voting classifiers, their theoretical performance has long remained sub-optimal: the best-known bounds on the number of training examples necessary for a voting classifier to obtain a given accuracy have so far always contained at least two logarithmic factors above what is known to be achievable by general weak-to-strong learners. In this work, we break this barrier by proposing a randomized Boosting algorithm that outputs voting classifiers whose generalization error contains only a single logarithmic dependency on the sample size. We obtain this result by building a general framework that extends sample compression methods to support randomized learning algorithms based on sub-sampling.
APA
da Cunha, A., Larsen, K.G. & Ritzert, M. (2025). Boosting, Voting Classifiers and Randomized Sample Compression Schemes. Proceedings of The 36th International Conference on Algorithmic Learning Theory, in Proceedings of Machine Learning Research 272:390-404. Available from https://proceedings.mlr.press/v272/cunha25a.html.