Algorithms for Direct 0–1 Loss Optimization in Binary Classification

Tan Nguyen, Scott Sanner
Proceedings of the 30th International Conference on Machine Learning, PMLR 28(3):1085-1093, 2013.

Abstract

While convex losses for binary classification are attractive due to the existence of numerous (provably) efficient methods for finding their global optima, they are sensitive to outliers. On the other hand, while the non-convex 0–1 loss is robust to outliers, it is NP-hard to optimize and thus rarely directly optimized in practice. In this paper, however, we do just that: we explore a variety of practical methods for direct (approximate) optimization of the 0–1 loss based on branch and bound search, combinatorial search, and coordinate descent on smooth, differentiable relaxations of 0–1 loss. Empirically, we compare our proposed algorithms to logistic regression, SVM, and the Bayes point machine, showing that the proposed 0–1 loss optimization algorithms perform at least as well and offer a clear advantage in the presence of outliers. To this end, we believe this work reiterates the importance of 0–1 loss and its robustness properties while challenging the notion that it is difficult to directly optimize.
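
The coordinate-descent approach mentioned in the abstract admits a compact illustration: replace each 0–1 indicator with a sigmoid σ(−K · y_i · wᵀx_i), which tends to the 0–1 loss as K → ∞ (it approaches 1 on misclassified points and 0 on correctly classified ones), then descend one weight coordinate at a time while annealing K. The sketch below is a minimal illustration of that idea in NumPy, not the paper's exact algorithm; the annealing schedule on K, the fixed step size, and the function names are assumptions for exposition.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def smooth_01_loss(w, X, y, K):
    # sigmoid(-K * y_i * w.x_i) -> 1 for misclassified points and
    # -> 0 for correctly classified ones as K grows, so this sum
    # approaches the 0-1 loss (labels y are in {-1, +1}).
    margins = y * (X @ w)
    return sigmoid(-K * margins).sum()

def coordinate_descent_01(X, y, K_schedule=(1.0, 10.0, 100.0),
                          step=0.1, sweeps=50):
    """Minimize a sigmoid relaxation of 0-1 loss by coordinate descent,
    annealing the sharpness K so early sweeps see a smoother surface.
    Illustrative sketch only; schedule and step size are assumptions."""
    d = X.shape[1]
    w = np.zeros(d)
    for K in K_schedule:            # anneal toward the true 0-1 loss
        for _ in range(sweeps):
            for j in range(d):      # update one coordinate at a time
                best_w = w[j]
                best_loss = smooth_01_loss(w, X, y, K)
                for delta in (-step, step):
                    w[j] += delta
                    loss = smooth_01_loss(w, X, y, K)
                    if loss < best_loss:
                        best_w, best_loss = w[j], loss
                    w[j] -= delta   # restore before trying next move
                w[j] = best_w
    return w

Annealing K matters here: at small K the relaxed surface is smooth enough for coordinate moves to make progress, while large K recovers the flat, outlier-robust behavior of the true 0–1 objective.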

Cite this Paper


BibTeX
@InProceedings{pmlr-v28-nguyen13a,
  title     = {Algorithms for Direct 0--1 Loss Optimization in Binary Classification},
  author    = {Nguyen, Tan and Sanner, Scott},
  booktitle = {Proceedings of the 30th International Conference on Machine Learning},
  pages     = {1085--1093},
  year      = {2013},
  editor    = {Dasgupta, Sanjoy and McAllester, David},
  volume    = {28},
  number    = {3},
  series    = {Proceedings of Machine Learning Research},
  address   = {Atlanta, Georgia, USA},
  month     = {17--19 Jun},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v28/nguyen13a.pdf},
  url       = {https://proceedings.mlr.press/v28/nguyen13a.html},
  abstract  = {While convex losses for binary classification are attractive due to the existence of numerous (provably) efficient methods for finding their global optima, they are sensitive to outliers. On the other hand, while the non-convex 0–1 loss is robust to outliers, it is NP-hard to optimize and thus rarely directly optimized in practice. In this paper, however, we do just that: we explore a variety of practical methods for direct (approximate) optimization of the 0–1 loss based on branch and bound search, combinatorial search, and coordinate descent on smooth, differentiable relaxations of 0–1 loss. Empirically, we compare our proposed algorithms to logistic regression, SVM, and the Bayes point machine showing that the proposed 0–1 loss optimization algorithms perform at least as well and offer a clear advantage in the presence of outliers. To this end, we believe this work reiterates the importance of 0–1 loss and its robustness properties while challenging the notion that it is difficult to directly optimize.}
}
Endnote
%0 Conference Paper
%T Algorithms for Direct 0–1 Loss Optimization in Binary Classification
%A Tan Nguyen
%A Scott Sanner
%B Proceedings of the 30th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2013
%E Sanjoy Dasgupta
%E David McAllester
%F pmlr-v28-nguyen13a
%I PMLR
%P 1085--1093
%U https://proceedings.mlr.press/v28/nguyen13a.html
%V 28
%N 3
%X While convex losses for binary classification are attractive due to the existence of numerous (provably) efficient methods for finding their global optima, they are sensitive to outliers. On the other hand, while the non-convex 0–1 loss is robust to outliers, it is NP-hard to optimize and thus rarely directly optimized in practice. In this paper, however, we do just that: we explore a variety of practical methods for direct (approximate) optimization of the 0–1 loss based on branch and bound search, combinatorial search, and coordinate descent on smooth, differentiable relaxations of 0–1 loss. Empirically, we compare our proposed algorithms to logistic regression, SVM, and the Bayes point machine showing that the proposed 0–1 loss optimization algorithms perform at least as well and offer a clear advantage in the presence of outliers. To this end, we believe this work reiterates the importance of 0–1 loss and its robustness properties while challenging the notion that it is difficult to directly optimize.
RIS
TY - CPAPER
TI - Algorithms for Direct 0–1 Loss Optimization in Binary Classification
AU - Tan Nguyen
AU - Scott Sanner
BT - Proceedings of the 30th International Conference on Machine Learning
DA - 2013/05/26
ED - Sanjoy Dasgupta
ED - David McAllester
ID - pmlr-v28-nguyen13a
PB - PMLR
DP - Proceedings of Machine Learning Research
VL - 28
IS - 3
SP - 1085
EP - 1093
L1 - http://proceedings.mlr.press/v28/nguyen13a.pdf
UR - https://proceedings.mlr.press/v28/nguyen13a.html
AB - While convex losses for binary classification are attractive due to the existence of numerous (provably) efficient methods for finding their global optima, they are sensitive to outliers. On the other hand, while the non-convex 0–1 loss is robust to outliers, it is NP-hard to optimize and thus rarely directly optimized in practice. In this paper, however, we do just that: we explore a variety of practical methods for direct (approximate) optimization of the 0–1 loss based on branch and bound search, combinatorial search, and coordinate descent on smooth, differentiable relaxations of 0–1 loss. Empirically, we compare our proposed algorithms to logistic regression, SVM, and the Bayes point machine showing that the proposed 0–1 loss optimization algorithms perform at least as well and offer a clear advantage in the presence of outliers. To this end, we believe this work reiterates the importance of 0–1 loss and its robustness properties while challenging the notion that it is difficult to directly optimize.
ER -
APA
Nguyen, T. & Sanner, S. (2013). Algorithms for Direct 0–1 Loss Optimization in Binary Classification. Proceedings of the 30th International Conference on Machine Learning, in Proceedings of Machine Learning Research 28(3):1085-1093. Available from https://proceedings.mlr.press/v28/nguyen13a.html.