Unsupervised Supervised Learning II: Margin-Based Classification without Labels


Krishnakumar Balasubramanian, Pinar Donmez, Guy Lebanon
Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, PMLR 15:137-145, 2011.


Many popular linear classifiers, such as logistic regression, boosting, or SVM, are trained by optimizing margin-based risk functions. Traditionally, these risk functions are computed from a labeled dataset. We develop a novel technique for estimating such risks using only unlabeled data and knowledge of the label marginal p(y). We prove that the proposed risk estimator is consistent on high-dimensional datasets and demonstrate it on synthetic and real-world data. In particular, we show how the estimate is used for evaluating classifiers in transfer learning, and for training classifiers using exclusively unlabeled data.
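
To make the idea of a label-free risk estimate concrete, the following is a minimal sketch of one way such an estimator could be built. It rests on an assumption that the abstract does not state: that the classifier score f(x), conditioned on the label y in {-1, +1}, is approximately Gaussian. Under that assumption, a two-component Gaussian mixture with mixing weights fixed to the known p(y) can be fit to the unlabeled scores, and the margin-based risk can then be computed as an expectation under the fitted mixture. The function names (`logistic_loss`, `fit_two_gaussians`, `estimate_risk`), the EM initialization, and the choice of logistic loss are illustrative and hypothetical, not taken from the paper.

```python
import numpy as np


def logistic_loss(margin):
    # Margin-based logistic loss log(1 + exp(-margin)), computed stably.
    return np.logaddexp(0.0, -margin)


def fit_two_gaussians(scores, p_pos, n_iter=200):
    """EM for a 1-D two-component Gaussian mixture with FIXED weights.

    Component 0 models scores of the negative class, component 1 the
    positive class. Assumes 0 < p_pos < 1 and that positives tend to
    score higher (used only to initialize the means).
    """
    scores = np.asarray(scores, dtype=float)
    weights = np.array([1.0 - p_pos, p_pos])

    # Initialize means from the lower and upper parts of the sorted scores.
    ordered = np.sort(scores)
    k = min(max(int(round((1.0 - p_pos) * len(ordered))), 1), len(ordered) - 1)
    mu = np.array([ordered[:k].mean(), ordered[k:].mean()])
    sd = np.full(2, scores.std() + 1e-6)

    for _ in range(n_iter):
        # E-step: posterior responsibility of each component for each score.
        log_p = (-0.5 * ((scores[:, None] - mu) / sd) ** 2
                 - np.log(sd) + np.log(weights))
        log_p -= log_p.max(axis=1, keepdims=True)
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)

        # M-step: update means and standard deviations only.
        nk = resp.sum(axis=0)
        mu = (resp * scores[:, None]).sum(axis=0) / nk
        var = (resp * (scores[:, None] - mu) ** 2).sum(axis=0) / nk
        sd = np.sqrt(np.maximum(var, 1e-12))
    return mu, sd


def estimate_risk(scores, p_pos, loss=logistic_loss, n_nodes=64):
    """Estimate E[loss(y * f(x))] from unlabeled scores and p(y = +1)."""
    mu, sd = fit_two_gaussians(scores, p_pos)
    # Gauss-Hermite quadrature for expectations under a Gaussian.
    nodes, wts = np.polynomial.hermite.hermgauss(n_nodes)

    def expect(g, m, s):
        return np.sum(wts * g(m + np.sqrt(2.0) * s * nodes)) / np.sqrt(np.pi)

    risk_neg = expect(lambda z: loss(-z), mu[0], sd[0])  # y = -1
    risk_pos = expect(lambda z: loss(z), mu[1], sd[1])   # y = +1
    return (1.0 - p_pos) * risk_neg + p_pos * risk_pos
```

Calling `estimate_risk(scores, p_pos)` on an array of classifier outputs returns a single risk value that can be compared across classifiers. Because the mixing weights are fixed rather than estimated, the sketch relies on p(y) being known and different from 0.5 to tell the two components apart; with balanced classes the initialization alone decides which component is treated as positive.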
