Semi-supervised Clustering with Pairwise Constraints: A Discriminative Approach

Zhengdong Lu
Proceedings of the Eleventh International Conference on Artificial Intelligence and Statistics, PMLR 2:299-306, 2007.

Abstract

We consider the semi-supervised clustering problem where we know (with varying degrees of certainty) that some sample pairs are (or are not) in the same class. Unlike previous efforts to adapt clustering algorithms to incorporate those pairwise relations, our work is based on a discriminative model. We generalize the standard Gaussian process classifier (GPC) to express our classification preference. To use the samples not involved in pairwise relations, we employ graph kernels (covariance matrices) based on the entire data set. Experiments on a variety of data sets show that our algorithm significantly outperforms several state-of-the-art methods.

Cite this Paper


BibTeX
@InProceedings{pmlr-v2-lu07a,
  title = {Semi-supervised Clustering with Pairwise Constraints: A Discriminative Approach},
  author = {Zhengdong Lu},
  booktitle = {Proceedings of the Eleventh International Conference on Artificial Intelligence and Statistics},
  pages = {299--306},
  year = {2007},
  editor = {Marina Meila and Xiaotong Shen},
  volume = {2},
  series = {Proceedings of Machine Learning Research},
  address = {San Juan, Puerto Rico},
  month = {21--24 Mar},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v2/lu07a/lu07a.pdf},
  url = {http://proceedings.mlr.press/v2/lu07a.html},
  abstract = {We consider the semi-supervised clustering problem where we know (with varying degree of certainty) that some sample pairs are (or are not) in the same class. Unlike previous efforts in adapting clustering algorithms to incorporate those pairwise relations, our work is based on a discriminative model. We generalize the standard Gaussian process classifier (GPC) to express our classification preference. To use the samples not involved in pairwise relations, we employ the graph kernels (covariance matrix) based on the entire data set. Experiments on a variety of data sets show that our algorithm significantly outperforms several state-of-the-art methods.}
}
Endnote
%0 Conference Paper
%T Semi-supervised Clustering with Pairwise Constraints: A Discriminative Approach
%A Zhengdong Lu
%B Proceedings of the Eleventh International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2007
%E Marina Meila
%E Xiaotong Shen
%F pmlr-v2-lu07a
%I PMLR
%J Proceedings of Machine Learning Research
%P 299--306
%U http://proceedings.mlr.press
%V 2
%W PMLR
%X We consider the semi-supervised clustering problem where we know (with varying degree of certainty) that some sample pairs are (or are not) in the same class. Unlike previous efforts in adapting clustering algorithms to incorporate those pairwise relations, our work is based on a discriminative model. We generalize the standard Gaussian process classifier (GPC) to express our classification preference. To use the samples not involved in pairwise relations, we employ the graph kernels (covariance matrix) based on the entire data set. Experiments on a variety of data sets show that our algorithm significantly outperforms several state-of-the-art methods.
RIS
TY  - CPAPER
TI  - Semi-supervised Clustering with Pairwise Constraints: A Discriminative Approach
AU  - Zhengdong Lu
BT  - Proceedings of the Eleventh International Conference on Artificial Intelligence and Statistics
PY  - 2007/03/11
DA  - 2007/03/11
ED  - Marina Meila
ED  - Xiaotong Shen
ID  - pmlr-v2-lu07a
PB  - PMLR
SP  - 299
DP  - PMLR
EP  - 306
L1  - http://proceedings.mlr.press/v2/lu07a/lu07a.pdf
UR  - http://proceedings.mlr.press/v2/lu07a.html
AB  - We consider the semi-supervised clustering problem where we know (with varying degree of certainty) that some sample pairs are (or are not) in the same class. Unlike previous efforts in adapting clustering algorithms to incorporate those pairwise relations, our work is based on a discriminative model. We generalize the standard Gaussian process classifier (GPC) to express our classification preference. To use the samples not involved in pairwise relations, we employ the graph kernels (covariance matrix) based on the entire data set. Experiments on a variety of data sets show that our algorithm significantly outperforms several state-of-the-art methods.
ER  -
APA
Lu, Z. (2007). Semi-supervised Clustering with Pairwise Constraints: A Discriminative Approach. Proceedings of the Eleventh International Conference on Artificial Intelligence and Statistics, in PMLR 2:299-306.
