One-Pass AUC Optimization

Wei Gao; Rong Jin; Shenghuo Zhu; Zhi-Hua Zhou

Proceedings of the 30th International Conference on Machine Learning, PMLR 28(3):906-914, 2013.

Abstract

AUC is an important performance measure and many algorithms have been devoted to AUC optimization, mostly by minimizing a surrogate convex loss on a training data set. In this work, we focus on one-pass AUC optimization that requires only going through the training data once without storing the entire training dataset, where conventional online learning algorithms cannot be applied directly because AUC is measured by a sum of losses defined over pairs of instances from different classes. We develop a regression-based algorithm which only needs to maintain the first and second order statistics of training data in memory, resulting in a storage requirement independent of the size of the training data. To efficiently handle high dimensional data, we develop a randomized algorithm that approximates the covariance matrices by low rank matrices. We verify, both theoretically and empirically, the effectiveness of the proposed algorithm.
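
The claim that first and second order statistics suffice can be illustrated with a small sketch. This is not the authors' algorithm or update rule; it only assumes a pairwise squared loss of the form (1 - w.(x_pos - x_neg))^2 and shows that its average over all positive/negative pairs can be computed from per-class counts, sums, and second-moment matrices. The names ClassStats, pairwise_sq_loss_from_stats, and pairwise_sq_loss_brute are hypothetical and chosen for illustration.

```python
# Illustrative sketch only: NOT the authors' algorithm or update rule.
# Assumption: the surrogate is the pairwise squared loss
#   (1 - w.(x_pos - x_neg))^2 averaged over all positive/negative pairs.
# The point is that this average depends on the data only through
# per-class counts, sums, and sums of outer products.


class ClassStats:
    """Running count, sum vector, and sum of outer products for one class."""

    def __init__(self, dim):
        self.n = 0
        self.s = [0.0] * dim
        self.m = [[0.0] * dim for _ in range(dim)]

    def update(self, x):
        # One pass: the instance x can be discarded after this call.
        self.n += 1
        for i, xi in enumerate(x):
            self.s[i] += xi
            for j, xj in enumerate(x):
                self.m[i][j] += xi * xj


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def quad(m, w):
    """Return w^T m w for a square matrix m given as a list of rows."""
    return sum(wi * dot(row, w) for wi, row in zip(w, m))


def pairwise_sq_loss_from_stats(w, pos, neg):
    """Mean pairwise squared loss computed from stored statistics only."""
    a = dot(w, pos.s) / pos.n  # mean score of positive instances
    b = dot(w, neg.s) / neg.n  # mean score of negative instances
    # Mean of (score_pos - score_neg)^2 over all pairs.
    second = quad(pos.m, w) / pos.n + quad(neg.m, w) / neg.n - 2.0 * a * b
    return 1.0 - 2.0 * (a - b) + second


def pairwise_sq_loss_brute(w, xs_pos, xs_neg):
    """Same quantity computed by enumerating every pair (for checking)."""
    total = 0.0
    for xp in xs_pos:
        for xn in xs_neg:
            total += (1.0 - (dot(w, xp) - dot(w, xn))) ** 2
    return total / (len(xs_pos) * len(xs_neg))
```

In this sketch the stored state is of size O(d^2) for d features, regardless of how many instances have been seen, which matches the storage property described in the abstract. The low-rank approximation for high dimensional data is not shown.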

Cite this Paper

BibTeX


@InProceedings{pmlr-v28-gao13,
  title = 	 {One-Pass AUC Optimization},
  author = 	 {Gao, Wei and Jin, Rong and Zhu, Shenghuo and Zhou, Zhi-Hua},
  booktitle = 	 {Proceedings of the 30th International Conference on Machine Learning},
  pages = 	 {906--914},
  year = 	 {2013},
  editor = 	 {Dasgupta, Sanjoy and McAllester, David},
  volume = 	 {28},
  number =       {3},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Atlanta, Georgia, USA},
  month = 	 {17--19 Jun},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v28/gao13.pdf},
  url = 	 {https://proceedings.mlr.press/v28/gao13.html},
  abstract = 	 {AUC is an important performance measure and many algorithms have been devoted to AUC optimization, mostly by minimizing a surrogate convex loss on a training data set. In this work, we focus on one-pass AUC optimization that requires only going through the training data once without storing the entire training dataset, where conventional online learning algorithms cannot be applied directly because AUC is measured by a sum of losses defined over pairs of instances from different classes. We develop a regression-based algorithm which only needs to maintain the first and second order statistics of training data in memory, resulting in a storage requirement independent of the size of the training data. To efficiently handle high dimensional data, we develop a randomized algorithm that approximates the covariance matrices by low rank matrices. We verify, both theoretically and empirically, the effectiveness of the proposed algorithm.}
}

Endnote

%0 Conference Paper
%T One-Pass AUC Optimization
%A Wei Gao
%A Rong Jin
%A Shenghuo Zhu
%A Zhi-Hua Zhou
%B Proceedings of the 30th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2013
%E Sanjoy Dasgupta
%E David McAllester
%F pmlr-v28-gao13
%I PMLR
%P 906--914
%U https://proceedings.mlr.press/v28/gao13.html
%V 28
%N 3
%X AUC is an important performance measure and many algorithms have been devoted to AUC optimization, mostly by minimizing a surrogate convex loss on a training data set. In this work, we focus on one-pass AUC optimization that requires only going through the training data once without storing the entire training dataset, where conventional online learning algorithms cannot be applied directly because AUC is measured by a sum of losses defined over pairs of instances from different classes. We develop a regression-based algorithm which only needs to maintain the first and second order statistics of training data in memory, resulting in a storage requirement independent of the size of the training data. To efficiently handle high dimensional data, we develop a randomized algorithm that approximates the covariance matrices by low rank matrices. We verify, both theoretically and empirically, the effectiveness of the proposed algorithm.

RIS


TY  - CPAPER
TI  - One-Pass AUC Optimization
AU  - Wei Gao
AU  - Rong Jin
AU  - Shenghuo Zhu
AU  - Zhi-Hua Zhou
BT  - Proceedings of the 30th International Conference on Machine Learning
DA  - 2013/05/26
ED  - Sanjoy Dasgupta
ED  - David McAllester
ID  - pmlr-v28-gao13
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 28
IS  - 3
SP  - 906
EP  - 914
L1  - http://proceedings.mlr.press/v28/gao13.pdf
UR  - https://proceedings.mlr.press/v28/gao13.html
AB  - AUC is an important performance measure and many algorithms have been devoted to AUC optimization, mostly by minimizing a surrogate convex loss on a training data set. In this work, we focus on one-pass AUC optimization that requires only going through the training data once without storing the entire training dataset, where conventional online learning algorithms cannot be applied directly because AUC is measured by a sum of losses defined over pairs of instances from different classes. We develop a regression-based algorithm which only needs to maintain the first and second order statistics of training data in memory, resulting in a storage requirement independent of the size of the training data. To efficiently handle high dimensional data, we develop a randomized algorithm that approximates the covariance matrices by low rank matrices. We verify, both theoretically and empirically, the effectiveness of the proposed algorithm.
ER  -

APA


Gao, W., Jin, R., Zhu, S. & Zhou, Z.-H. (2013). One-Pass AUC Optimization. Proceedings of the 30th International Conference on Machine Learning, in Proceedings of Machine Learning Research 28(3):906-914. Available from https://proceedings.mlr.press/v28/gao13.html.
