Surrogate Regret Bounds for the Area Under the ROC Curve via Strongly Proper Losses

Shivani Agarwal
Proceedings of the 26th Annual Conference on Learning Theory, PMLR 30:338-353, 2013.

Abstract

The area under the ROC curve (AUC) is a widely used performance measure in machine learning, and has been widely studied in recent years, particularly in the context of bipartite ranking. A dominant theoretical and algorithmic framework for AUC optimization/bipartite ranking has been to reduce the problem to pairwise classification; in particular, it is well known that the AUC regret can be formulated as a pairwise classification regret, which in turn can be upper bounded using usual regret bounds for binary classification. Recently, Kotlowski et al. (2011) showed AUC regret bounds in terms of the regret associated with ‘balanced’ versions of the standard (non-pairwise) logistic and exponential losses. In this paper, we obtain such (non-pairwise) surrogate regret bounds for the AUC in terms of a broad class of proper (composite) losses that we term ‘strongly proper’. Our proof technique is considerably simpler than that of Kotlowski et al. (2011), and relies on properties of proper (composite) losses as elucidated recently by Reid and Williamson (2009, 2010, 2011) and others. Our result yields explicit surrogate bounds (with no hidden balancing terms) in terms of a variety of strongly proper losses, including for example logistic, exponential, squared and squared hinge losses. An important consequence is that standard algorithms minimizing a (non-pairwise) strongly proper loss, such as logistic regression and boosting algorithms (assuming a universal function class and appropriate regularization), are in fact AUC-consistent; moreover, our results allow us to quantify the AUC regret in terms of the corresponding surrogate regret. We also obtain tighter surrogate regret bounds under certain low-noise conditions via a recent result of Clémençon and Robbiano (2011).
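
To fix notation for the statements above (a schematic summary: the definitions are standard, and the bound below is a paraphrase whose exact constant should be checked against the paper's theorem statements): let (X, Y) be drawn from a distribution on \mathcal{X} \times \{-1, +1\}, with \eta(x) = \mathbf{P}(Y = 1 \mid X = x) and \pi = \mathbf{P}(Y = 1). For a scoring function f : \mathcal{X} \to \mathbb{R},

\[
\mathrm{AUC}(f) = \mathbf{E}\big[\mathbf{1}(f(X) > f(X')) + \tfrac{1}{2}\,\mathbf{1}(f(X) = f(X')) \,\big|\, Y = 1,\, Y' = -1\big],
\qquad
\mathrm{regret}_{\mathrm{AUC}}(f) = \sup_{g} \mathrm{AUC}(g) - \mathrm{AUC}(f).
\]

A proper loss \ell with conditional risk L(\eta, \hat{p}) = \eta\,\ell(1, \hat{p}) + (1 - \eta)\,\ell(-1, \hat{p}) is called \lambda-strongly proper when

\[
L(\eta, \hat{p}) - L(\eta, \eta) \;\ge\; \tfrac{\lambda}{2}\,(\eta - \hat{p})^2 \quad \text{for all } \eta, \hat{p} \in [0, 1],
\]

and the paper's surrogate regret bounds then take the form

\[
\mathrm{regret}_{\mathrm{AUC}}(f) \;\le\; \frac{1}{\pi(1 - \pi)} \sqrt{\frac{2\,\mathrm{regret}_{\ell}(f)}{\lambda}},
\]

relating the AUC regret of f to its (non-pairwise) surrogate regret under \ell.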

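As an empirical companion to the consistency claim (a minimal sketch in Python, not code from the paper; the synthetic data, linear model, and step size are illustrative assumptions): empirical AUC is a pairwise statistic over (positive, negative) pairs, yet scores learned by minimizing the plain, non-pairwise logistic loss, which is strongly proper, can be plugged in directly for ranking.

# Illustrative sketch (not from the paper): empirical AUC is a pairwise
# statistic, and scores learned by minimizing the (non-pairwise) logistic
# loss -- a strongly proper loss -- can be used directly for ranking.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: class-conditional Gaussians, pi = P(Y = +1) = 0.4.
n = 2000
y = np.where(rng.random(n) < 0.4, 1, -1)
X = rng.normal(loc=(y == 1).astype(float)[:, None], scale=1.0, size=(n, 2))

def empirical_auc(scores, labels):
    """Pairwise form of the AUC: fraction of (positive, negative) pairs
    ranked correctly, counting ties as 1/2."""
    pos, neg = scores[labels == 1], scores[labels == -1]
    diff = pos[:, None] - neg[None, :]
    return (diff > 0).mean() + 0.5 * (diff == 0).mean()

# Minimize the plain logistic loss log(1 + exp(-y f(x))) by gradient
# descent on a linear scorer f(x) = w . x + b; no pairwise terms anywhere.
w, b = np.zeros(2), 0.0
for _ in range(500):
    margins = y * (X @ w + b)
    g = -y / (1.0 + np.exp(np.clip(margins, -500, 500)))  # dloss/df
    w -= 0.1 * (X * g[:, None]).mean(axis=0)
    b -= 0.1 * g.mean()

print("AUC of logistic-loss scores:", empirical_auc(X @ w + b, y))
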
Cite this Paper


BibTeX
@InProceedings{pmlr-v30-Agarwal13,
  title     = {Surrogate Regret Bounds for the Area Under the ROC Curve via Strongly Proper Losses},
  author    = {Agarwal, Shivani},
  booktitle = {Proceedings of the 26th Annual Conference on Learning Theory},
  pages     = {338--353},
  year      = {2013},
  editor    = {Shalev-Shwartz, Shai and Steinwart, Ingo},
  volume    = {30},
  series    = {Proceedings of Machine Learning Research},
  address   = {Princeton, NJ, USA},
  month     = {12--14 Jun},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v30/Agarwal13.pdf},
  url       = {https://proceedings.mlr.press/v30/Agarwal13.html}
}
APA
Agarwal, S. (2013). Surrogate Regret Bounds for the Area Under the ROC Curve via Strongly Proper Losses. Proceedings of the 26th Annual Conference on Learning Theory, in Proceedings of Machine Learning Research 30:338-353. Available from https://proceedings.mlr.press/v30/Agarwal13.html.
