Consistency Analysis for Binary Classification Revisited

Krzysztof Dembczyński, Wojciech Kotłowski, Oluwasanmi Koyejo, Nagarajan Natarajan
Proceedings of the 34th International Conference on Machine Learning, PMLR 70:961-969, 2017.

Abstract

Statistical learning theory is at an inflection point enabled by recent advances in understanding and optimizing a wide range of metrics. Of particular interest are non-decomposable metrics such as the F-measure and the Jaccard measure which cannot be represented as a simple average over examples. Non-decomposability is the primary source of difficulty in theoretical analysis, and interestingly has led to two distinct settings and notions of consistency. In this manuscript we analyze both settings, from statistical and algorithmic points of view, to explore the connections and to highlight differences between them for a wide range of metrics. The analysis complements previous results on this topic, clarifies common confusions around both settings, and provides guidance to the theory and practice of binary classification with complex metrics.

Cite this Paper


BibTeX
@InProceedings{pmlr-v70-dembczynski17a,
  title     = {Consistency Analysis for Binary Classification Revisited},
  author    = {Krzysztof Dembczy{\'{n}}ski and Wojciech Kot{\l}owski and Oluwasanmi Koyejo and Nagarajan Natarajan},
  pages     = {961--969},
  year      = {2017},
  editor    = {Doina Precup and Yee Whye Teh},
  volume    = {70},
  series    = {Proceedings of Machine Learning Research},
  address   = {International Convention Centre, Sydney, Australia},
  month     = {06--11 Aug},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v70/dembczynski17a/dembczynski17a.pdf},
  url       = {http://proceedings.mlr.press/v70/dembczynski17a.html},
  abstract  = {Statistical learning theory is at an inflection point enabled by recent advances in understanding and optimizing a wide range of metrics. Of particular interest are non-decomposable metrics such as the F-measure and the Jaccard measure which cannot be represented as a simple average over examples. Non-decomposability is the primary source of difficulty in theoretical analysis, and interestingly has led to two distinct settings and notions of consistency. In this manuscript we analyze both settings, from statistical and algorithmic points of view, to explore the connections and to highlight differences between them for a wide range of metrics. The analysis complements previous results on this topic, clarifies common confusions around both settings, and provides guidance to the theory and practice of binary classification with complex metrics.}
}
Endnote
%0 Conference Paper
%T Consistency Analysis for Binary Classification Revisited
%A Krzysztof Dembczyński
%A Wojciech Kotłowski
%A Oluwasanmi Koyejo
%A Nagarajan Natarajan
%B Proceedings of the 34th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2017
%E Doina Precup
%E Yee Whye Teh
%F pmlr-v70-dembczynski17a
%I PMLR
%J Proceedings of Machine Learning Research
%P 961--969
%U http://proceedings.mlr.press
%V 70
%W PMLR
%X Statistical learning theory is at an inflection point enabled by recent advances in understanding and optimizing a wide range of metrics. Of particular interest are non-decomposable metrics such as the F-measure and the Jaccard measure which cannot be represented as a simple average over examples. Non-decomposability is the primary source of difficulty in theoretical analysis, and interestingly has led to two distinct settings and notions of consistency. In this manuscript we analyze both settings, from statistical and algorithmic points of view, to explore the connections and to highlight differences between them for a wide range of metrics. The analysis complements previous results on this topic, clarifies common confusions around both settings, and provides guidance to the theory and practice of binary classification with complex metrics.
APA
Dembczyński, K., Kotłowski, W., Koyejo, O. & Natarajan, N. (2017). Consistency Analysis for Binary Classification Revisited. Proceedings of the 34th International Conference on Machine Learning, in PMLR 70:961-969.