The Coherent Loss Function for Classification

Wenzhuo Yang, Melvyn Sim, Huan Xu
Proceedings of the 31st International Conference on Machine Learning, PMLR 32(1):37-45, 2014.

Abstract

A prediction rule in binary classification that aims to achieve the lowest probability of misclassification involves minimizing a non-convex 0-1 loss function, which is typically a computationally intractable optimization problem. To address this intractability, previous methods minimize the cumulative loss: the sum of convex surrogates of the 0-1 loss over the individual samples. In this paper, we revisit this paradigm and instead develop an axiomatic framework: we propose a set of salient properties that a loss function for binary classification should satisfy, and from these we propose the coherent loss approach, a tractable upper bound on the empirical classification error over the entire sample set. We show that the proposed approach yields a strictly tighter approximation to the empirical classification error than any convex cumulative loss approach while preserving the convexity of the underlying optimization problem. The approach also admits a robustness interpretation, which connects it to robust SVMs. Experimental results show that our approach outperforms the standard SVM when additional constraints are imposed.
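As a rough illustration of the surrogate-loss idea the abstract refers to (not the paper's coherent loss itself, which is defined in the paper), the following Python sketch compares the empirical 0-1 loss with the cumulative hinge surrogate that standard SVM training minimizes. All function names and data below are hypothetical.

import numpy as np

def zero_one_loss(w, X, y):
    # Fraction of samples misclassified by the linear rule sign(X @ w).
    margins = y * (X @ w)
    return np.mean(margins <= 0)

def cumulative_hinge_loss(w, X, y):
    # Average of per-sample convex surrogates max(0, 1 - y_i * w^T x_i).
    # Pointwise, the hinge loss upper-bounds the 0-1 loss of each sample,
    # so this average never falls below the empirical classification error.
    margins = y * (X @ w)
    return np.mean(np.maximum(0.0, 1.0 - margins))

# Toy data: labels roughly determined by the first feature, plus noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = np.sign(X[:, 0] + 0.3 * rng.normal(size=200))
w = np.array([1.0, 0.0])

print(zero_one_loss(w, X, y))          # empirical classification error
print(cumulative_hinge_loss(w, X, y))  # convex upper bound, never smaller

The paper's coherent loss replaces this per-sample sum with a tighter convex upper bound defined over the whole sample set; the sketch above only shows the cumulative-loss baseline it improves upon.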

Cite this Paper


BibTeX
@InProceedings{pmlr-v32-yanga14,
  title     = {The Coherent Loss Function for Classification},
  author    = {Yang, Wenzhuo and Sim, Melvyn and Xu, Huan},
  booktitle = {Proceedings of the 31st International Conference on Machine Learning},
  pages     = {37--45},
  year      = {2014},
  editor    = {Xing, Eric P. and Jebara, Tony},
  volume    = {32},
  number    = {1},
  series    = {Proceedings of Machine Learning Research},
  address   = {Beijing, China},
  month     = {22--24 Jun},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v32/yanga14.pdf},
  url       = {https://proceedings.mlr.press/v32/yanga14.html}
}
Endnote
%0 Conference Paper
%T The Coherent Loss Function for Classification
%A Wenzhuo Yang
%A Melvyn Sim
%A Huan Xu
%B Proceedings of the 31st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2014
%E Eric P. Xing
%E Tony Jebara
%F pmlr-v32-yanga14
%I PMLR
%P 37--45
%U https://proceedings.mlr.press/v32/yanga14.html
%V 32
%N 1
RIS
TY  - CPAPER
TI  - The Coherent Loss Function for Classification
AU  - Wenzhuo Yang
AU  - Melvyn Sim
AU  - Huan Xu
BT  - Proceedings of the 31st International Conference on Machine Learning
DA  - 2014/01/27
ED  - Eric P. Xing
ED  - Tony Jebara
ID  - pmlr-v32-yanga14
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 32
IS  - 1
SP  - 37
EP  - 45
L1  - http://proceedings.mlr.press/v32/yanga14.pdf
UR  - https://proceedings.mlr.press/v32/yanga14.html
ER  -
APA
Yang, W., Sim, M. & Xu, H. (2014). The Coherent Loss Function for Classification. Proceedings of the 31st International Conference on Machine Learning, in Proceedings of Machine Learning Research 32(1):37-45. Available from https://proceedings.mlr.press/v32/yanga14.html.
