Algorithms for Direct 0–1 Loss Optimization in Binary Classification

Tan Nguyen; Scott Sanner

Algorithms for Direct 0–1 Loss Optimization in Binary Classification

Tan Nguyen, Scott Sanner

Proceedings of the 30th International Conference on Machine Learning, PMLR 28(3):1085-1093, 2013.

Abstract

While convex losses for binary classification are attractive due to the existence of numerous (provably) efficient methods for finding their global optima, they are sensitive to outliers. On the other hand, while the non-convex 0–1 loss is robust to outliers, it is NP-hard to optimize and thus rarely directly optimized in practice. In this paper, however, we do just that: we explore a variety of practical methods for direct (approximate) optimization of the 0–1 loss based on branch and bound search, combinatorial search, and coordinate descent on smooth, differentiable relaxations of 0–1 loss. Empirically, we compare our proposed algorithms to logistic regression, SVM, and the Bayes point machine showing that the proposed 0–1 loss optimization algorithms perform at least as well and offer a clear advantage in the presence of outliers. To this end, we believe this work reiterates the importance of 0–1 loss and its robustness properties while challenging the notion that it is difficult to directly optimize.

Cite this Paper

BibTeX


@InProceedings{pmlr-v28-nguyen13a,
  title = 	 {Algorithms for Direct 0--1 Loss Optimization in Binary Classification},
  author = 	 {Nguyen, Tan and Sanner, Scott},
  booktitle = 	 {Proceedings of the 30th International Conference on Machine Learning},
  pages = 	 {1085--1093},
  year = 	 {2013},
  editor = 	 {Dasgupta, Sanjoy and McAllester, David},
  volume = 	 {28},
  number =       {3},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Atlanta, Georgia, USA},
  month = 	 {17--19 Jun},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v28/nguyen13a.pdf},
  url = 	 {https://proceedings.mlr.press/v28/nguyen13a.html},
  abstract = 	 {While convex losses for binary classification are attractive due to the existence of numerous (provably) efficient methods for finding their global optima, they are sensitive to outliers.  On the other hand, while the non-convex 0–1 loss is robust to outliers, it is NP-hard to optimize and thus rarely directly optimized in practice.  In this paper, however, we do just that: we explore a variety of practical methods for direct (approximate) optimization of the 0–1 loss based on branch and bound search, combinatorial search, and coordinate descent on smooth, differentiable relaxations of 0–1 loss. Empirically, we compare our proposed algorithms to logistic regression, SVM, and the Bayes point machine showing that the proposed 0–1 loss optimization algorithms perform at least as well and offer a clear advantage in the presence of outliers.  To this end, we believe this work reiterates the importance of 0–1 loss and its robustness properties while challenging the notion that it is difficult to directly optimize.}
}

Endnote

%0 Conference Paper
%T Algorithms for Direct 0–1 Loss Optimization in Binary Classification
%A Tan Nguyen
%A Scott Sanner
%B Proceedings of the 30th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2013
%E Sanjoy Dasgupta
%E David McAllester	
%F pmlr-v28-nguyen13a
%I PMLR
%P 1085--1093
%U https://proceedings.mlr.press/v28/nguyen13a.html
%V 28
%N 3
%X While convex losses for binary classification are attractive due to the existence of numerous (provably) efficient methods for finding their global optima, they are sensitive to outliers.  On the other hand, while the non-convex 0–1 loss is robust to outliers, it is NP-hard to optimize and thus rarely directly optimized in practice.  In this paper, however, we do just that: we explore a variety of practical methods for direct (approximate) optimization of the 0–1 loss based on branch and bound search, combinatorial search, and coordinate descent on smooth, differentiable relaxations of 0–1 loss. Empirically, we compare our proposed algorithms to logistic regression, SVM, and the Bayes point machine showing that the proposed 0–1 loss optimization algorithms perform at least as well and offer a clear advantage in the presence of outliers.  To this end, we believe this work reiterates the importance of 0–1 loss and its robustness properties while challenging the notion that it is difficult to directly optimize.

RIS


TY  - CPAPER
TI  - Algorithms for Direct 0–1 Loss Optimization in Binary Classification
AU  - Tan Nguyen
AU  - Scott Sanner
BT  - Proceedings of the 30th International Conference on Machine Learning
DA  - 2013/05/26
ED  - Sanjoy Dasgupta
ED  - David McAllester	
ID  - pmlr-v28-nguyen13a
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 28
IS  - 3
SP  - 1085
EP  - 1093
L1  - http://proceedings.mlr.press/v28/nguyen13a.pdf
UR  - https://proceedings.mlr.press/v28/nguyen13a.html
AB  - While convex losses for binary classification are attractive due to the existence of numerous (provably) efficient methods for finding their global optima, they are sensitive to outliers.  On the other hand, while the non-convex 0–1 loss is robust to outliers, it is NP-hard to optimize and thus rarely directly optimized in practice.  In this paper, however, we do just that: we explore a variety of practical methods for direct (approximate) optimization of the 0–1 loss based on branch and bound search, combinatorial search, and coordinate descent on smooth, differentiable relaxations of 0–1 loss. Empirically, we compare our proposed algorithms to logistic regression, SVM, and the Bayes point machine showing that the proposed 0–1 loss optimization algorithms perform at least as well and offer a clear advantage in the presence of outliers.  To this end, we believe this work reiterates the importance of 0–1 loss and its robustness properties while challenging the notion that it is difficult to directly optimize.
ER  -

APA


Nguyen, T. & Sanner, S.. (2013). Algorithms for Direct 0–1 Loss Optimization in Binary Classification. Proceedings of the 30th International Conference on Machine Learning, in Proceedings of Machine Learning Research 28(3):1085-1093 Available from https://proceedings.mlr.press/v28/nguyen13a.html.

Related Material

Download PDF