Scalable Generative Models for Multi-label Learning with Missing Labels

Vikas Jain, Nirbhay Modhe, Piyush Rai
Proceedings of the 34th International Conference on Machine Learning, PMLR 70:1636-1644, 2017.

Abstract

We present a scalable, generative framework for multi-label learning with missing labels. Our framework consists of a latent factor model for the binary label matrix, which is coupled with an exposure model to account for label missingness (i.e., whether a zero in the label matrix is indeed a zero or denotes a missing observation). The underlying latent factor model also assumes that the low-dimensional embeddings of each label vector are directly conditioned on the respective feature vector of that example. Our generative framework admits a simple inference procedure, such that the parameter estimation reduces to a sequence of simple weighted least-square regression problems, each of which can be solved easily, efficiently, and in parallel. Moreover, inference can also be performed in an online fashion using mini-batches of training examples, which makes our framework scalable for large data sets, even when using moderate computational resources. We report both quantitative and qualitative results for our framework on several benchmark data sets, comparing it with a number of state-of-the-art methods.
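The inference procedure the abstract describes — an exposure posterior that down-weights zeros which may simply be unobserved, followed by per-row weighted least-squares updates of the embeddings — can be sketched roughly as below. This is an illustrative reconstruction from the abstract alone, not the paper's exact updates: the Gaussian exposure likelihood, the prior exposure probability `mu`, and all variable names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (illustrative only)
n, L, d, k = 50, 20, 8, 5
Y = (rng.random((n, L)) < 0.1).astype(float)   # sparse binary label matrix
X = rng.standard_normal((n, d))                # feature vectors

U = 0.1 * rng.standard_normal((n, k))          # example embeddings
V = 0.1 * rng.standard_normal((L, k))          # label embeddings
mu = 0.5                                        # assumed prior exposure probability
lam, sigma = 0.1, 1.0                           # ridge penalty, noise scale

for it in range(10):
    pred = U @ V.T
    # E-step: posterior probability that each entry was "exposed".
    # Observed ones are surely exposed; zeros are down-weighted by how
    # likely they are to be missing rather than true negatives.
    lik = np.exp(-0.5 * pred**2 / sigma**2)
    w = np.where(Y > 0, 1.0, mu * lik / (mu * lik + 1.0 - mu))

    # M-step: each row update is an independent weighted ridge regression,
    # so all of them can be solved in parallel.
    for i in range(n):
        U[i] = np.linalg.solve(V.T @ (w[i, :, None] * V) + lam * np.eye(k),
                               V.T @ (w[i] * Y[i]))
    for j in range(L):
        V[j] = np.linalg.solve(U.T @ (w[:, j, None] * U) + lam * np.eye(k),
                               U.T @ (w[:, j] * Y[:, j]))

    # Feature conditioning (per the abstract, embeddings are conditioned on
    # features): regress the example embeddings onto X by ridge regression.
    Wx = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ U)
```

An online variant, as the abstract notes, would run the same E/M updates over mini-batches of rows rather than the full matrix, which keeps memory usage bounded on large data sets.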

Cite this Paper


BibTeX
@InProceedings{pmlr-v70-jain17a,
  title =     {Scalable Generative Models for Multi-label Learning with Missing Labels},
  author =    {Vikas Jain and Nirbhay Modhe and Piyush Rai},
  booktitle = {Proceedings of the 34th International Conference on Machine Learning},
  pages =     {1636--1644},
  year =      {2017},
  editor =    {Precup, Doina and Teh, Yee Whye},
  volume =    {70},
  series =    {Proceedings of Machine Learning Research},
  month =     {06--11 Aug},
  publisher = {PMLR},
  pdf =       {http://proceedings.mlr.press/v70/jain17a/jain17a.pdf},
  url =       {https://proceedings.mlr.press/v70/jain17a.html},
  abstract =  {We present a scalable, generative framework for multi-label learning with missing labels. Our framework consists of a latent factor model for the binary label matrix, which is coupled with an exposure model to account for label missingness (i.e., whether a zero in the label matrix is indeed a zero or denotes a missing observation). The underlying latent factor model also assumes that the low-dimensional embeddings of each label vector are directly conditioned on the respective feature vector of that example. Our generative framework admits a simple inference procedure, such that the parameter estimation reduces to a sequence of simple weighted least-square regression problems, each of which can be solved easily, efficiently, and in parallel. Moreover, inference can also be performed in an online fashion using mini-batches of training examples, which makes our framework scalable for large data sets, even when using moderate computational resources. We report both quantitative and qualitative results for our framework on several benchmark data sets, comparing it with a number of state-of-the-art methods.}
}
Endnote
%0 Conference Paper
%T Scalable Generative Models for Multi-label Learning with Missing Labels
%A Vikas Jain
%A Nirbhay Modhe
%A Piyush Rai
%B Proceedings of the 34th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2017
%E Doina Precup
%E Yee Whye Teh
%F pmlr-v70-jain17a
%I PMLR
%P 1636--1644
%U https://proceedings.mlr.press/v70/jain17a.html
%V 70
%X We present a scalable, generative framework for multi-label learning with missing labels. Our framework consists of a latent factor model for the binary label matrix, which is coupled with an exposure model to account for label missingness (i.e., whether a zero in the label matrix is indeed a zero or denotes a missing observation). The underlying latent factor model also assumes that the low-dimensional embeddings of each label vector are directly conditioned on the respective feature vector of that example. Our generative framework admits a simple inference procedure, such that the parameter estimation reduces to a sequence of simple weighted least-square regression problems, each of which can be solved easily, efficiently, and in parallel. Moreover, inference can also be performed in an online fashion using mini-batches of training examples, which makes our framework scalable for large data sets, even when using moderate computational resources. We report both quantitative and qualitative results for our framework on several benchmark data sets, comparing it with a number of state-of-the-art methods.
APA
Jain, V., Modhe, N. & Rai, P. (2017). Scalable Generative Models for Multi-label Learning with Missing Labels. Proceedings of the 34th International Conference on Machine Learning, in Proceedings of Machine Learning Research 70:1636-1644. Available from https://proceedings.mlr.press/v70/jain17a.html.