Active Sampling of Pairs and Points for Large-scale Linear Bipartite Ranking

Wei-Yuan Shen; Hsuan-Tien Lin

Active Sampling of Pairs and Points for Large-scale Linear Bipartite Ranking

Wei-Yuan Shen, Hsuan-Tien Lin

Proceedings of the 5th Asian Conference on Machine Learning, PMLR 29:388-403, 2013.

Abstract

Bipartite ranking is a fundamental ranking problem that learns to order relevant instances ahead of irrelevant ones. One major approach for bipartite ranking, called the pair-wise approach, tackles an equivalent binary classification problem of whether one instance out of a pair of instances should be ranked higher than the other. Nevertheless, the number of instance pairs constructed from the input data could be quadratic to the size of the input data, which makes pair-wise ranking generally infeasible on large-scale data sets. Another major approach for bipartite ranking, called the point-wise approach, directly solves a binary classification problem between relevant and irrelevant instance points. This approach is feasible for large-scale data sets, but the resulting ranking performance can be inferior. That is, it is difficult to conduct bipartite ranking accurately and efficiently at the same time. In this paper, we develop a novel scheme within the pair-wise approach to conduct bipartite ranking efficiently. The scheme, called Active Sampling, is inspired from the rich field of active learning and can reach a competitive ranking performance while focusing only on a small subset of the many pairs during training. Moreover, we propose a general Combined Ranking and Classification (CRC) framework to accurately conduct bipartite ranking. The framework unifies point-wise and pair-wise approaches and is simply based on the idea of treating each instance point as a pseudo-pair. Experiments on 14 real- word large-scale data sets demonstrate that the proposed algorithm of Active Sampling within CRC, when coupled with a linear Support Vector Machine, usually outperforms state-of-the-art point-wise and pair-wise ranking approaches in terms of both accuracy and efficiency.

Cite this Paper

BibTeX


@InProceedings{pmlr-v29-Shen13,
  title = 	 {Active Sampling of Pairs and Points for Large-scale Linear Bipartite Ranking},
  author = 	 {Shen, Wei-Yuan and Lin, Hsuan-Tien},
  booktitle = 	 {Proceedings of the 5th Asian Conference on Machine Learning},
  pages = 	 {388--403},
  year = 	 {2013},
  editor = 	 {Ong, Cheng Soon and Ho, Tu Bao},
  volume = 	 {29},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Australian National University, Canberra, Australia},
  month = 	 {13--15 Nov},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v29/Shen13.pdf},
  url = 	 {https://proceedings.mlr.press/v29/Shen13.html},
  abstract = 	 {Bipartite ranking is a fundamental ranking problem that learns to order relevant instances ahead of irrelevant ones. One major approach for bipartite ranking, called the pair-wise approach, tackles an equivalent binary classification problem of whether one instance out of a pair of instances should be ranked higher than the other. Nevertheless, the number of instance pairs constructed from the input data could be quadratic to the size of the input data, which makes pair-wise ranking generally infeasible on large-scale data sets. Another major approach for bipartite ranking, called the point-wise approach, directly solves a binary classification problem between relevant and irrelevant instance points. This approach is feasible for large-scale data sets, but the resulting ranking performance can be inferior. That is, it is difficult to conduct bipartite ranking accurately and efficiently at the same time. In this paper, we develop a novel scheme within the pair-wise approach to conduct bipartite ranking efficiently. The scheme, called Active Sampling, is inspired from the rich field of active learning and can reach a competitive ranking performance while focusing only on a small subset of the many pairs during training. Moreover, we propose a general Combined Ranking and Classification (CRC) framework to accurately conduct bipartite ranking. The framework unifies point-wise and pair-wise approaches and is simply based on the idea of treating each instance point as a pseudo-pair. Experiments on 14 real- word large-scale data sets demonstrate that the proposed algorithm of Active Sampling within CRC, when coupled with a linear Support Vector Machine, usually outperforms state-of-the-art point-wise and pair-wise ranking approaches in terms of both accuracy and efficiency.}
}

Endnote

%0 Conference Paper
%T Active Sampling of Pairs and Points for Large-scale Linear Bipartite Ranking
%A Wei-Yuan Shen
%A Hsuan-Tien Lin
%B Proceedings of the 5th Asian Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2013
%E Cheng Soon Ong
%E Tu Bao Ho	
%F pmlr-v29-Shen13
%I PMLR
%P 388--403
%U https://proceedings.mlr.press/v29/Shen13.html
%V 29
%X Bipartite ranking is a fundamental ranking problem that learns to order relevant instances ahead of irrelevant ones. One major approach for bipartite ranking, called the pair-wise approach, tackles an equivalent binary classification problem of whether one instance out of a pair of instances should be ranked higher than the other. Nevertheless, the number of instance pairs constructed from the input data could be quadratic to the size of the input data, which makes pair-wise ranking generally infeasible on large-scale data sets. Another major approach for bipartite ranking, called the point-wise approach, directly solves a binary classification problem between relevant and irrelevant instance points. This approach is feasible for large-scale data sets, but the resulting ranking performance can be inferior. That is, it is difficult to conduct bipartite ranking accurately and efficiently at the same time. In this paper, we develop a novel scheme within the pair-wise approach to conduct bipartite ranking efficiently. The scheme, called Active Sampling, is inspired from the rich field of active learning and can reach a competitive ranking performance while focusing only on a small subset of the many pairs during training. Moreover, we propose a general Combined Ranking and Classification (CRC) framework to accurately conduct bipartite ranking. The framework unifies point-wise and pair-wise approaches and is simply based on the idea of treating each instance point as a pseudo-pair. Experiments on 14 real- word large-scale data sets demonstrate that the proposed algorithm of Active Sampling within CRC, when coupled with a linear Support Vector Machine, usually outperforms state-of-the-art point-wise and pair-wise ranking approaches in terms of both accuracy and efficiency.

RIS


TY  - CPAPER
TI  - Active Sampling of Pairs and Points for Large-scale Linear Bipartite Ranking
AU  - Wei-Yuan Shen
AU  - Hsuan-Tien Lin
BT  - Proceedings of the 5th Asian Conference on Machine Learning
DA  - 2013/10/21
ED  - Cheng Soon Ong
ED  - Tu Bao Ho	
ID  - pmlr-v29-Shen13
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 29
SP  - 388
EP  - 403
L1  - http://proceedings.mlr.press/v29/Shen13.pdf
UR  - https://proceedings.mlr.press/v29/Shen13.html
AB  - Bipartite ranking is a fundamental ranking problem that learns to order relevant instances ahead of irrelevant ones. One major approach for bipartite ranking, called the pair-wise approach, tackles an equivalent binary classification problem of whether one instance out of a pair of instances should be ranked higher than the other. Nevertheless, the number of instance pairs constructed from the input data could be quadratic to the size of the input data, which makes pair-wise ranking generally infeasible on large-scale data sets. Another major approach for bipartite ranking, called the point-wise approach, directly solves a binary classification problem between relevant and irrelevant instance points. This approach is feasible for large-scale data sets, but the resulting ranking performance can be inferior. That is, it is difficult to conduct bipartite ranking accurately and efficiently at the same time. In this paper, we develop a novel scheme within the pair-wise approach to conduct bipartite ranking efficiently. The scheme, called Active Sampling, is inspired from the rich field of active learning and can reach a competitive ranking performance while focusing only on a small subset of the many pairs during training. Moreover, we propose a general Combined Ranking and Classification (CRC) framework to accurately conduct bipartite ranking. The framework unifies point-wise and pair-wise approaches and is simply based on the idea of treating each instance point as a pseudo-pair. Experiments on 14 real- word large-scale data sets demonstrate that the proposed algorithm of Active Sampling within CRC, when coupled with a linear Support Vector Machine, usually outperforms state-of-the-art point-wise and pair-wise ranking approaches in terms of both accuracy and efficiency.
ER  -

APA


Shen, W. & Lin, H.. (2013). Active Sampling of Pairs and Points for Large-scale Linear Bipartite Ranking. Proceedings of the 5th Asian Conference on Machine Learning, in Proceedings of Machine Learning Research 29:388-403 Available from https://proceedings.mlr.press/v29/Shen13.html.

Related Material

Download PDF