A Gaussian Latent Variable Model for Large Margin Classification of Labeled and Unlabeled Data

Do-kyum Kim, Matthew Der, Lawrence Saul
Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics, PMLR 33:484-492, 2014.

Abstract

We investigate a Gaussian latent variable model for semi-supervised learning of linear large margin classifiers. The model’s latent variables encode the signed distance of examples to the separating hyperplane, and we constrain these variables, for both labeled and unlabeled examples, to ensure that the classes are separated by a large margin. Our approach is based on intuitions similar to those behind semi-supervised support vector machines (S3VMs), but it formalizes these intuitions in a probabilistic framework. Within this framework we derive an especially simple Expectation-Maximization (EM) algorithm for learning. The algorithm alternates between applying Bayes’ rule to “fill in” the latent variables (the E-step) and performing an unconstrained least-squares regression to update the weight vector (the M-step). For the best results it is necessary to constrain the unlabeled data to have a similar ratio of positive to negative examples as the labeled data. Within our model this constraint renders exact inference intractable, but we show that a Lyapunov central limit theorem (for sums of independent but non-identically distributed random variables) provides an excellent approximation to the true posterior distribution. We perform experiments on large-scale text classification and find that our model significantly outperforms existing implementations of S3VMs.
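To make the EM alternation concrete, here is a minimal sketch (not the authors’ implementation) of the two steps described above, under simplifying assumptions of our own: unit-variance latent variables z_i ~ N(w·x_i, 1), a margin constraint y_i z_i ≥ 1 on labeled examples, |z_i| ≥ 1 on unlabeled examples, a uniform class prior in place of the class-ratio constraint, and a small ridge term in the M-step. The E-step fills in each z_i with the mean of its truncated Gaussian posterior; the M-step is an ordinary least-squares regression of those filled-in values on the inputs.

```python
# A minimal sketch of the EM algorithm described in the abstract -- not the
# authors' code. Assumptions (ours, for illustration): z_i ~ N(w.x_i, 1);
# labeled examples constrained to y_i * z_i >= 1; unlabeled examples
# constrained to |z_i| >= 1 with a uniform class prior. The class-ratio
# constraint and its Lyapunov-CLT approximation are omitted here.
import numpy as np
from scipy.stats import norm

def truncated_mean(mu, lo, hi):
    """Mean of a unit-variance Gaussian N(mu, 1) truncated to [lo, hi]."""
    a, b = lo - mu, hi - mu
    Z = np.clip(norm.cdf(b) - norm.cdf(a), 1e-12, None)
    return mu + (norm.pdf(a) - norm.pdf(b)) / Z

def em_fit(X_lab, y_lab, X_unl, n_iters=50, ridge=1e-3):
    X = np.vstack([X_lab, X_unl])
    w = np.zeros(X.shape[1])
    for _ in range(n_iters):
        # E-step: "fill in" each latent variable with its posterior mean.
        mu_lab, mu_unl = X_lab @ w, X_unl @ w
        # Labeled: the posterior is a Gaussian truncated to the correct
        # side of the margin (z >= +1 for y = +1, z <= -1 for y = -1).
        z_lab = np.where(y_lab > 0,
                         truncated_mean(mu_lab, 1.0, np.inf),
                         truncated_mean(mu_lab, -np.inf, -1.0))
        # Unlabeled: z lies outside the margin, so the posterior is a
        # mixture of the two one-sided truncations, weighted by their mass.
        p_pos = 1.0 - norm.cdf(1.0 - mu_unl)   # Pr(z >= +1)
        p_neg = norm.cdf(-1.0 - mu_unl)        # Pr(z <= -1)
        q = p_pos / np.clip(p_pos + p_neg, 1e-12, None)
        z_unl = (q * truncated_mean(mu_unl, 1.0, np.inf)
                 + (1.0 - q) * truncated_mean(mu_unl, -np.inf, -1.0))
        z = np.concatenate([z_lab, z_unl])
        # M-step: unconstrained (ridge-regularized) least-squares regression
        # of the filled-in targets z on the inputs X.
        w = np.linalg.solve(X.T @ X + ridge * np.eye(X.shape[1]), X.T @ z)
    return w
```

A bias term can be absorbed by appending a constant feature to X. The class-ratio constraint from the abstract would couple the unlabeled examples, replacing the per-example mixture weights q with a joint posterior over label assignments; that coupling is where the Lyapunov central limit theorem approximation enters.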

Cite this Paper

BibTeX
@InProceedings{pmlr-v33-kim14a,
  title     = {{A Gaussian Latent Variable Model for Large Margin Classification of Labeled and Unlabeled Data}},
  author    = {Kim, Do-kyum and Der, Matthew and Saul, Lawrence},
  booktitle = {Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics},
  pages     = {484--492},
  year      = {2014},
  editor    = {Kaski, Samuel and Corander, Jukka},
  volume    = {33},
  series    = {Proceedings of Machine Learning Research},
  address   = {Reykjavik, Iceland},
  month     = {22--25 Apr},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v33/kim14a.pdf},
  url       = {https://proceedings.mlr.press/v33/kim14a.html}
}
Endnote
%0 Conference Paper
%T A Gaussian Latent Variable Model for Large Margin Classification of Labeled and Unlabeled Data
%A Do-kyum Kim
%A Matthew Der
%A Lawrence Saul
%B Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2014
%E Samuel Kaski
%E Jukka Corander
%F pmlr-v33-kim14a
%I PMLR
%P 484--492
%U https://proceedings.mlr.press/v33/kim14a.html
%V 33
RIS
TY - CPAPER
TI - A Gaussian Latent Variable Model for Large Margin Classification of Labeled and Unlabeled Data
AU - Do-kyum Kim
AU - Matthew Der
AU - Lawrence Saul
BT - Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics
DA - 2014/04/02
ED - Samuel Kaski
ED - Jukka Corander
ID - pmlr-v33-kim14a
PB - PMLR
DP - Proceedings of Machine Learning Research
VL - 33
SP - 484
EP - 492
L1 - http://proceedings.mlr.press/v33/kim14a.pdf
UR - https://proceedings.mlr.press/v33/kim14a.html
ER -
APA
Kim, D., Der, M. & Saul, L. (2014). A Gaussian Latent Variable Model for Large Margin Classification of Labeled and Unlabeled Data. Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 33:484-492. Available from https://proceedings.mlr.press/v33/kim14a.html.

Related Material

Download PDF: http://proceedings.mlr.press/v33/kim14a.pdf