Consistency versus Realizable H-Consistency for Multiclass Classification

Phil Long; Rocco Servedio

Proceedings of the 30th International Conference on Machine Learning, PMLR 28(3):801-809, 2013.

Abstract

A consistent loss function for multiclass classification is one such that for any source of labeled examples, any tuple of scoring functions that minimizes the expected loss will have classification accuracy close to that of the Bayes optimal classifier. While consistency has been proposed as a desirable property for multiclass loss functions, we give experimental and theoretical results exhibiting a sequence of linearly separable data sources with the following property: a multiclass classification algorithm which optimizes a loss function due to Crammer and Singer (which is known not to be consistent) produces classifiers whose expected error goes to 0, while the expected error of an algorithm which optimizes a generalization of the loss function used by LogitBoost (a loss function which is known to be consistent) is bounded below by a positive constant. We identify a property of a loss function, realizable consistency with respect to a restricted class of scoring functions, that accounts for this difference. As our main technical results we show that the Crammer–Singer loss function is realizable consistent for the class of linear scoring functions, while the generalization of LogitBoost is not. Our result for LogitBoost is a special case of a more general theorem that applies to several other loss functions that have been proposed for multiclass classification.
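As a rough illustration of the two kinds of loss the abstract contrasts, the sketch below evaluates a Crammer and Singer style multiclass hinge loss and a multiclass logistic loss on a single score vector. The abstract does not spell out either formula, so both are assumptions here: the hinge form is the commonly cited one, and the softmax cross-entropy form is only assumed to be the "generalization of the loss function used by LogitBoost". The function names are hypothetical.

```python
import math

def crammer_singer_loss(scores, y):
    """Multiclass hinge loss (assumed form): max(0, 1 + max_{k != y} s_k - s_y)."""
    # Highest score among the incorrect classes.
    rival = max(s for k, s in enumerate(scores) if k != y)
    return max(0.0, 1.0 + rival - scores[y])

def multiclass_logistic_loss(scores, y):
    """Softmax cross-entropy (assumed form): ln(sum_k exp(s_k - s_y))."""
    # The k == y term contributes exp(0) = 1, so the loss is always > 0.
    return math.log(sum(math.exp(s - scores[y]) for s in scores))

# One example with three classes; the correct class is 0.
scores = [2.0, 0.0, -1.0]
hinge = crammer_singer_loss(scores, 0)
logistic = multiclass_logistic_loss(scores, 0)
```

Under these assumed forms, the hinge loss is exactly zero once the correct class leads every other class by a margin of at least 1, whereas the logistic loss stays strictly positive for any finite scores. This is only meant to make the two loss families concrete; it does not reproduce the paper's construction or its proofs.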

Cite this Paper

BibTeX


@InProceedings{pmlr-v28-long13,
  title = 	 {Consistency versus Realizable H-Consistency for Multiclass Classification},
  author = 	 {Long, Phil and Servedio, Rocco},
  booktitle = 	 {Proceedings of the 30th International Conference on Machine Learning},
  pages = 	 {801--809},
  year = 	 {2013},
  editor = 	 {Dasgupta, Sanjoy and McAllester, David},
  volume = 	 {28},
  number =       {3},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Atlanta, Georgia, USA},
  month = 	 {17--19 Jun},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v28/long13.pdf},
  url = 	 {https://proceedings.mlr.press/v28/long13.html},
  abstract = 	 {A consistent loss function for multiclass classification is one such that for any source of labeled examples, any tuple of scoring functions that minimizes the expected loss will have classification accuracy close to that of the Bayes optimal classifier. While consistency has been proposed as a desirable property for multiclass loss functions, we give experimental and theoretical results exhibiting a sequence of linearly separable data sources with the following property: a multiclass classification algorithm which optimizes a loss function due to Crammer and Singer (which is known not to be consistent) produces classifiers whose expected error goes to 0, while the expected error of an algorithm which optimizes a generalization of the loss function used by LogitBoost (a loss function which is known to be consistent) is bounded below by a positive constant. We identify a property of a loss function, realizable consistency with respect to a restricted class of scoring functions, that accounts for this difference. As our main technical results we show that the Crammer–Singer loss function is realizable consistent for the class of linear scoring functions, while the generalization of LogitBoost is not. Our result for LogitBoost is a special case of a more general theorem that applies to several other loss functions that have been proposed for multiclass classification.}
}

Endnote

%0 Conference Paper
%T Consistency versus Realizable H-Consistency for Multiclass Classification
%A Phil Long
%A Rocco Servedio
%B Proceedings of the 30th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2013
%E Sanjoy Dasgupta
%E David McAllester
%F pmlr-v28-long13
%I PMLR
%P 801--809
%U https://proceedings.mlr.press/v28/long13.html
%V 28
%N 3
%X A consistent loss function for multiclass classification is one such that for any source of labeled examples, any tuple of scoring functions that minimizes the expected loss will have classification accuracy close to that of the Bayes optimal classifier. While consistency has been proposed as a desirable property for multiclass loss functions, we give experimental and theoretical results exhibiting a sequence of linearly separable data sources with the following property: a multiclass classification algorithm which optimizes a loss function due to Crammer and Singer (which is known not to be consistent) produces classifiers whose expected error goes to 0, while the expected error of an algorithm which optimizes a generalization of the loss function used by LogitBoost (a loss function which is known to be consistent) is bounded below by a positive constant. We identify a property of a loss function, realizable consistency with respect to a restricted class of scoring functions, that accounts for this difference. As our main technical results we show that the Crammer–Singer loss function is realizable consistent for the class of linear scoring functions, while the generalization of LogitBoost is not. Our result for LogitBoost is a special case of a more general theorem that applies to several other loss functions that have been proposed for multiclass classification.

RIS


TY  - CPAPER
TI  - Consistency versus Realizable H-Consistency for Multiclass Classification
AU  - Phil Long
AU  - Rocco Servedio
BT  - Proceedings of the 30th International Conference on Machine Learning
DA  - 2013/05/26
ED  - Sanjoy Dasgupta
ED  - David McAllester
ID  - pmlr-v28-long13
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 28
IS  - 3
SP  - 801
EP  - 809
L1  - http://proceedings.mlr.press/v28/long13.pdf
UR  - https://proceedings.mlr.press/v28/long13.html
AB  - A consistent loss function for multiclass classification is one such that for any source of labeled examples, any tuple of scoring functions that minimizes the expected loss will have classification accuracy close to that of the Bayes optimal classifier. While consistency has been proposed as a desirable property for multiclass loss functions, we give experimental and theoretical results exhibiting a sequence of linearly separable data sources with the following property: a multiclass classification algorithm which optimizes a loss function due to Crammer and Singer (which is known not to be consistent) produces classifiers whose expected error goes to 0, while the expected error of an algorithm which optimizes a generalization of the loss function used by LogitBoost (a loss function which is known to be consistent) is bounded below by a positive constant. We identify a property of a loss function, realizable consistency with respect to a restricted class of scoring functions, that accounts for this difference. As our main technical results we show that the Crammer–Singer loss function is realizable consistent for the class of linear scoring functions, while the generalization of LogitBoost is not. Our result for LogitBoost is a special case of a more general theorem that applies to several other loss functions that have been proposed for multiclass classification.
ER  -

APA


Long, P. & Servedio, R. (2013). Consistency versus Realizable H-Consistency for Multiclass Classification. Proceedings of the 30th International Conference on Machine Learning, in Proceedings of Machine Learning Research 28(3):801-809. Available from https://proceedings.mlr.press/v28/long13.html.
