Unsupervised Supervised Learning II: Margin-Based Classification without Labels

Krishnakumar Balasubramanian; Pinar Donmez; Guy Lebanon

Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, PMLR 15:137-145, 2011.

Abstract

Many popular linear classifiers, such as logistic regression, boosting, or SVM, are trained by optimizing margin-based risk functions. Traditionally, these risk functions are computed based on a labeled dataset. We develop a novel technique for estimating such risks using only unlabeled data and knowledge of $p(y)$. We prove that the proposed risk estimator is consistent on high-dimensional datasets and demonstrate it on synthetic and real-world data. In particular, we show how the estimate is used for evaluating classifiers in transfer learning, and for training classifiers using exclusively unlabeled data.

Cite this Paper

BibTeX


@InProceedings{pmlr-v15-balasubramanian11a,
  title     = {Unsupervised Supervised Learning II: Margin-Based Classification without Labels},
  author    = {Balasubramanian, Krishnakumar and Donmez, Pinar and Lebanon, Guy},
  booktitle = {Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics},
  pages     = {137--145},
  year      = {2011},
  editor    = {Gordon, Geoffrey and Dunson, David and Dudík, Miroslav},
  volume    = {15},
  series    = {Proceedings of Machine Learning Research},
  address   = {Fort Lauderdale, FL, USA},
  month     = {11--13 Apr},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v15/balasubramanian11a/balasubramanian11a.pdf},
  url       = {https://proceedings.mlr.press/v15/balasubramanian11a.html},
  abstract  = {Many popular linear classifiers, such as logistic regression, boosting, or SVM, are trained by optimizing margin-based risk functions. Traditionally, these risk functions are computed based on a labeled dataset. We develop a novel technique for estimating such risks using only unlabeled data and knowledge of $p(y)$. We prove that the proposed risk estimator is consistent on high-dimensional datasets and demonstrate it on synthetic and real-world data. In particular, we show how the estimate is used for evaluating classifiers in transfer learning, and for training classifiers using exclusively unlabeled data.}
}

Endnote

%0 Conference Paper
%T Unsupervised Supervised Learning II: Margin-Based Classification without Labels
%A Krishnakumar Balasubramanian
%A Pinar Donmez
%A Guy Lebanon
%B Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2011
%E Geoffrey Gordon
%E David Dunson
%E Miroslav Dudík
%F pmlr-v15-balasubramanian11a
%I PMLR
%P 137--145
%U https://proceedings.mlr.press/v15/balasubramanian11a.html
%V 15
%X Many popular linear classifiers, such as logistic regression, boosting, or SVM, are trained by optimizing margin-based risk functions. Traditionally, these risk functions are computed based on a labeled dataset. We develop a novel technique for estimating such risks using only unlabeled data and knowledge of $p(y)$. We prove that the proposed risk estimator is consistent on high-dimensional datasets and demonstrate it on synthetic and real-world data. In particular, we show how the estimate is used for evaluating classifiers in transfer learning, and for training classifiers using exclusively unlabeled data.

RIS


TY  - CPAPER
TI  - Unsupervised Supervised Learning II: Margin-Based Classification without Labels
AU  - Krishnakumar Balasubramanian
AU  - Pinar Donmez
AU  - Guy Lebanon
BT  - Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics
DA  - 2011/06/14
ED  - Geoffrey Gordon
ED  - David Dunson
ED  - Miroslav Dudík
ID  - pmlr-v15-balasubramanian11a
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 15
SP  - 137
EP  - 145
L1  - http://proceedings.mlr.press/v15/balasubramanian11a/balasubramanian11a.pdf
UR  - https://proceedings.mlr.press/v15/balasubramanian11a.html
AB  - Many popular linear classifiers, such as logistic regression, boosting, or SVM, are trained by optimizing margin-based risk functions. Traditionally, these risk functions are computed based on a labeled dataset. We develop a novel technique for estimating such risks using only unlabeled data and knowledge of $p(y)$. We prove that the proposed risk estimator is consistent on high-dimensional datasets and demonstrate it on synthetic and real-world data. In particular, we show how the estimate is used for evaluating classifiers in transfer learning, and for training classifiers using exclusively unlabeled data.
ER  -

APA


Balasubramanian, K., Donmez, P., & Lebanon, G. (2011). Unsupervised Supervised Learning II: Margin-Based Classification without Labels. Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 15:137-145. Available from https://proceedings.mlr.press/v15/balasubramanian11a.html.
