Why Averaging Classifiers can Protect Against Overfitting

Yoav Freund, Yishay Mansour, Robert E. Schapire
Proceedings of the Eighth International Workshop on Artificial Intelligence and Statistics, PMLR R3:98-105, 2001.

Abstract

We study a simple learning algorithm for binary classification. Instead of predicting with the best hypothesis in the hypothesis class, this algorithm predicts with a weighted average of all hypotheses, weighted exponentially with respect to their training error. We show that the prediction of this algorithm is much more stable than the prediction of an algorithm that predicts with the best hypothesis. By allowing the algorithm to abstain from predicting on some examples, we show that the predictions it makes when it does not abstain are very reliable. Finally, we show that the probability that the algorithm abstains is comparable to the generalization error of the best hypothesis in the class.
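
To make the setup concrete, here is a minimal sketch of such an averaging classifier, assuming binary labels in {-1, +1} and a finite hypothesis class. It weights each hypothesis exponentially in its training error and abstains when the weighted vote is close to zero, following the description in the abstract. The learning rate eta, the abstention margin, and the helper name exp_weighted_predict are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def exp_weighted_predict(hypotheses, X_train, y_train, x, eta=1.0, margin=0.5):
    """Predict with an exponentially weighted average of hypotheses.

    Each hypothesis h maps an input to a label in {-1, +1}.  Hypothesis
    h gets weight exp(-eta * m * err_hat(h)), where err_hat(h) is its
    training error on m examples.  The classifier abstains (returns 0)
    when the weighted vote is within `margin` of zero.  This is a sketch
    of the idea in the abstract, not the authors' exact algorithm.
    """
    m = len(y_train)
    votes, total = 0.0, 0.0
    for h in hypotheses:
        # Empirical (training) error of this hypothesis.
        err = np.mean([h(xi) != yi for xi, yi in zip(X_train, y_train)])
        w = np.exp(-eta * m * err)   # exponential weight in training error
        votes += w * h(x)
        total += w
    score = votes / total            # weighted average vote in [-1, +1]
    if abs(score) < margin:
        return 0                     # abstain: hypotheses disagree too much
    return 1 if score > 0 else -1

if __name__ == "__main__":
    # Toy demo: threshold stumps on one-dimensional data.
    rng = np.random.default_rng(0)
    X = rng.uniform(-1, 1, size=20)
    y = np.sign(X)                   # true concept: sign(x)
    stumps = [lambda x, t=t: 1 if x > t else -1
              for t in np.linspace(-1, 1, 21)]
    print(exp_weighted_predict(stumps, X, y, 0.3))   # confident: +1
    print(exp_weighted_predict(stumps, X, y, 0.01))  # near boundary: may abstain
```

Because the weights decay exponentially with training error, near-optimal hypotheses dominate the vote; the classifier only abstains on inputs where these low-error hypotheses disagree, which is why the predictions it does make are stable and reliable.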

Cite this Paper


BibTeX
@InProceedings{pmlr-vR3-freund01a,
  title     = {Why Averaging Classifiers can Protect Against Overfitting},
  author    = {Freund, Yoav and Mansour, Yishay and Schapire, Robert E.},
  booktitle = {Proceedings of the Eighth International Workshop on Artificial Intelligence and Statistics},
  pages     = {98--105},
  year      = {2001},
  editor    = {Richardson, Thomas S. and Jaakkola, Tommi S.},
  volume    = {R3},
  series    = {Proceedings of Machine Learning Research},
  month     = {04--07 Jan},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/r3/freund01a/freund01a.pdf},
  url       = {https://proceedings.mlr.press/r3/freund01a.html},
  abstract  = {We study a simple learning algorithm for binary classification. Instead of predicting with the best hypothesis in the hypothesis class, this algorithm predicts with a weighted average of all hypotheses, weighted exponentially with respect to their training error. We show that the prediction of this algorithm is much more stable than the prediction of an algorithm that predicts with the best hypothesis. By allowing the algorithm to abstain from predicting on some examples, we show that the predictions it makes when it does not abstain are very reliable. Finally, we show that the probability that the algorithm abstains is comparable to the generalization error of the best hypothesis in the class.},
  note      = {Reissued by PMLR on 31 March 2021.}
}
Endnote
%0 Conference Paper
%T Why Averaging Classifiers can Protect Against Overfitting
%A Yoav Freund
%A Yishay Mansour
%A Robert E. Schapire
%B Proceedings of the Eighth International Workshop on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2001
%E Thomas S. Richardson
%E Tommi S. Jaakkola
%F pmlr-vR3-freund01a
%I PMLR
%P 98--105
%U https://proceedings.mlr.press/r3/freund01a.html
%V R3
%X We study a simple learning algorithm for binary classification. Instead of predicting with the best hypothesis in the hypothesis class, this algorithm predicts with a weighted average of all hypotheses, weighted exponentially with respect to their training error. We show that the prediction of this algorithm is much more stable than the prediction of an algorithm that predicts with the best hypothesis. By allowing the algorithm to abstain from predicting on some examples, we show that the predictions it makes when it does not abstain are very reliable. Finally, we show that the probability that the algorithm abstains is comparable to the generalization error of the best hypothesis in the class.
%Z Reissued by PMLR on 31 March 2021.
APA
Freund, Y., Mansour, Y. & Schapire, R.E. (2001). Why Averaging Classifiers can Protect Against Overfitting. Proceedings of the Eighth International Workshop on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research R3:98-105. Available from https://proceedings.mlr.press/r3/freund01a.html. Reissued by PMLR on 31 March 2021.