Rethinking Fano’s Inequality in Ensemble Learning

Terufumi Morishita, Gaku Morio, Shota Horiguchi, Hiroaki Ozaki, Nobuo Nukaga
Proceedings of the 39th International Conference on Machine Learning, PMLR 162:15976-16016, 2022.

Abstract

We propose a fundamental theory on ensemble learning that evaluates a given ensemble system by a well-grounded set of metrics. Previous studies used a variant of Fano’s inequality of information theory and derived a lower bound of the classification error rate on the basis of the accuracy and diversity of models. We revisit the original Fano’s inequality and argue that the studies did not take into account the information lost when multiple model predictions are combined into a final prediction. To address this issue, we generalize the previous theory to incorporate the information loss. Further, we empirically validate and demonstrate the proposed theory through extensive experiments on actual systems. The theory reveals the strengths and weaknesses of systems on each metric, which will push the theoretical understanding of ensemble learning and give us insights into designing systems.
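
For reference, the textbook form of Fano's inequality that the abstract alludes to can be written as follows. This is a sketch under assumed notation, not the paper's own theorem: $Y$ is the true label taking values in a finite set $\mathcal{Y}$, $O$ denotes the collected model predictions, $\hat{Y} = f(O)$ is the combined final prediction, and $H_b$ is the binary entropy function. None of these symbols appear in the abstract itself.

```latex
% Standard Fano's inequality applied to the final prediction \hat{Y},
% preceded by the data-processing step for the Markov chain Y -> O -> \hat{Y}.
\[
  H(Y \mid O) \;\le\; H(Y \mid \hat{Y})
  \;\le\; H_b(P_e) + P_e \log\bigl(|\mathcal{Y}| - 1\bigr),
  \qquad P_e = \Pr(\hat{Y} \ne Y).
\]
% The gap in the first inequality is the information lost by combining:
\[
  H(Y \mid \hat{Y}) - H(Y \mid O) \;=\; I(Y; O) - I(Y; \hat{Y}) \;\ge\; 0.
\]
```

Read this way, a bound stated only in terms of $H(Y \mid O)$ ignores the nonnegative gap in the second line, which is one plausible reading of the "information lost when multiple model predictions are combined" that the abstract describes.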

Cite this Paper


BibTeX
@InProceedings{pmlr-v162-morishita22a,
  title     = {Rethinking Fano’s Inequality in Ensemble Learning},
  author    = {Morishita, Terufumi and Morio, Gaku and Horiguchi, Shota and Ozaki, Hiroaki and Nukaga, Nobuo},
  booktitle = {Proceedings of the 39th International Conference on Machine Learning},
  pages     = {15976--16016},
  year      = {2022},
  editor    = {Chaudhuri, Kamalika and Jegelka, Stefanie and Song, Le and Szepesvari, Csaba and Niu, Gang and Sabato, Sivan},
  volume    = {162},
  series    = {Proceedings of Machine Learning Research},
  month     = {17--23 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v162/morishita22a/morishita22a.pdf},
  url       = {https://proceedings.mlr.press/v162/morishita22a.html},
  abstract  = {We propose a fundamental theory on ensemble learning that evaluates a given ensemble system by a well-grounded set of metrics. Previous studies used a variant of Fano’s inequality of information theory and derived a lower bound of the classification error rate on the basis of the accuracy and diversity of models. We revisit the original Fano’s inequality and argue that the studies did not take into account the information lost when multiple model predictions are combined into a final prediction. To address this issue, we generalize the previous theory to incorporate the information loss. Further, we empirically validate and demonstrate the proposed theory through extensive experiments on actual systems. The theory reveals the strengths and weaknesses of systems on each metric, which will push the theoretical understanding of ensemble learning and give us insights into designing systems.}
}
Endnote
%0 Conference Paper
%T Rethinking Fano’s Inequality in Ensemble Learning
%A Terufumi Morishita
%A Gaku Morio
%A Shota Horiguchi
%A Hiroaki Ozaki
%A Nobuo Nukaga
%B Proceedings of the 39th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2022
%E Kamalika Chaudhuri
%E Stefanie Jegelka
%E Le Song
%E Csaba Szepesvari
%E Gang Niu
%E Sivan Sabato
%F pmlr-v162-morishita22a
%I PMLR
%P 15976--16016
%U https://proceedings.mlr.press/v162/morishita22a.html
%V 162
%X We propose a fundamental theory on ensemble learning that evaluates a given ensemble system by a well-grounded set of metrics. Previous studies used a variant of Fano’s inequality of information theory and derived a lower bound of the classification error rate on the basis of the accuracy and diversity of models. We revisit the original Fano’s inequality and argue that the studies did not take into account the information lost when multiple model predictions are combined into a final prediction. To address this issue, we generalize the previous theory to incorporate the information loss. Further, we empirically validate and demonstrate the proposed theory through extensive experiments on actual systems. The theory reveals the strengths and weaknesses of systems on each metric, which will push the theoretical understanding of ensemble learning and give us insights into designing systems.
APA
Morishita, T., Morio, G., Horiguchi, S., Ozaki, H. & Nukaga, N. (2022). Rethinking Fano’s Inequality in Ensemble Learning. Proceedings of the 39th International Conference on Machine Learning, in Proceedings of Machine Learning Research 162:15976-16016. Available from https://proceedings.mlr.press/v162/morishita22a.html.
