Reducing Label Complexity by Learning From Bags

Sivan Sabato, Nathan Srebro, Naftali Tishby
Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, PMLR 9:685-692, 2010.

Abstract

We consider a supervised learning setting in which the main cost of learning is the number of training labels and one can obtain a single label for a bag of examples, indicating only if a positive example exists in the bag, as in Multi-Instance Learning. We thus propose to create a training sample of bags, and to use the obtained labels to learn to classify individual examples. We provide a theoretical analysis showing how to select the bag size as a function of the problem parameters, and prove that if the original labels are distributed unevenly, the number of required labels drops considerably when learning from bags. We demonstrate that finding a low-error separating hyperplane from bags is feasible in this setting using a simple iterative procedure similar to latent SVM. Experiments on synthetic and real data sets demonstrate the success of the approach.
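As a rough illustration of the bag-labeling setting described in the abstract (not the paper's algorithm or its bag-size analysis), the sketch below assumes instances are drawn i.i.d., each positive with probability p, and grouped into bags of a fixed size r. A bag is labeled positive exactly when it contains at least one positive instance, so under this assumption a bag is positive with probability 1 - (1 - p)^r. The names bag_label, prob_positive_bag, sample_bags, p and r are illustrative and do not come from the paper.

```python
import random


def bag_label(instance_labels):
    """Return True iff at least one instance in the bag is positive."""
    return any(instance_labels)


def prob_positive_bag(p, r):
    """Probability that a bag of r i.i.d. instances is positive,
    assuming each instance is positive with probability p."""
    return 1.0 - (1.0 - p) ** r


def sample_bags(p, r, n_bags, seed=0):
    """Draw n_bags bags of r Boolean instance labels each (assumed i.i.d.)."""
    rng = random.Random(seed)
    return [[rng.random() < p for _ in range(r)] for _ in range(n_bags)]


if __name__ == "__main__":
    p = 0.05  # assumed fraction of positive instances (uneven labels)
    for r in (1, 5, 14, 50):
        bags = sample_bags(p, r, n_bags=20000)
        empirical = sum(bag_label(b) for b in bags) / len(bags)
        print(f"r={r:3d}  analytic={prob_positive_bag(p, r):.3f}  "
              f"empirical={empirical:.3f}")
```

With rare positives, a single instance label is almost always negative, whereas the label of a moderately sized bag is closer to balanced; this is only meant to make the setting concrete, and the paper itself should be consulted for how the bag size is actually chosen.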

Cite this Paper


BibTeX
@InProceedings{pmlr-v9-sabato10a,
  title = {Reducing Label Complexity by Learning From Bags},
  author = {Sabato, Sivan and Srebro, Nathan and Tishby, Naftali},
  booktitle = {Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics},
  pages = {685--692},
  year = {2010},
  editor = {Teh, Yee Whye and Titterington, Mike},
  volume = {9},
  series = {Proceedings of Machine Learning Research},
  address = {Chia Laguna Resort, Sardinia, Italy},
  month = {13--15 May},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v9/sabato10a/sabato10a.pdf},
  url = {https://proceedings.mlr.press/v9/sabato10a.html},
  abstract = {We consider a supervised learning setting in which the main cost of learning is the number of training labels and one can obtain a single label for a bag of examples, indicating only if a positive example exists in the bag, as in Multi-Instance Learning. We thus propose to create a training sample of bags, and to use the obtained labels to learn to classify individual examples. We provide a theoretical analysis showing how to select the bag size as a function of the problem parameters, and prove that if the original labels are distributed unevenly, the number of required labels drops considerably when learning from bags. We demonstrate that finding a low-error separating hyperplane from bags is feasible in this setting using a simple iterative procedure similar to latent SVM. Experiments on synthetic and real data sets demonstrate the success of the approach.}
}
Endnote
%0 Conference Paper
%T Reducing Label Complexity by Learning From Bags
%A Sivan Sabato
%A Nathan Srebro
%A Naftali Tishby
%B Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2010
%E Yee Whye Teh
%E Mike Titterington
%F pmlr-v9-sabato10a
%I PMLR
%P 685--692
%U https://proceedings.mlr.press/v9/sabato10a.html
%V 9
%X We consider a supervised learning setting in which the main cost of learning is the number of training labels and one can obtain a single label for a bag of examples, indicating only if a positive example exists in the bag, as in Multi-Instance Learning. We thus propose to create a training sample of bags, and to use the obtained labels to learn to classify individual examples. We provide a theoretical analysis showing how to select the bag size as a function of the problem parameters, and prove that if the original labels are distributed unevenly, the number of required labels drops considerably when learning from bags. We demonstrate that finding a low-error separating hyperplane from bags is feasible in this setting using a simple iterative procedure similar to latent SVM. Experiments on synthetic and real data sets demonstrate the success of the approach.
RIS
TY  - CPAPER
TI  - Reducing Label Complexity by Learning From Bags
AU  - Sivan Sabato
AU  - Nathan Srebro
AU  - Naftali Tishby
BT  - Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics
DA  - 2010/03/31
ED  - Yee Whye Teh
ED  - Mike Titterington
ID  - pmlr-v9-sabato10a
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 9
SP  - 685
EP  - 692
L1  - http://proceedings.mlr.press/v9/sabato10a/sabato10a.pdf
UR  - https://proceedings.mlr.press/v9/sabato10a.html
AB  - We consider a supervised learning setting in which the main cost of learning is the number of training labels and one can obtain a single label for a bag of examples, indicating only if a positive example exists in the bag, as in Multi-Instance Learning. We thus propose to create a training sample of bags, and to use the obtained labels to learn to classify individual examples. We provide a theoretical analysis showing how to select the bag size as a function of the problem parameters, and prove that if the original labels are distributed unevenly, the number of required labels drops considerably when learning from bags. We demonstrate that finding a low-error separating hyperplane from bags is feasible in this setting using a simple iterative procedure similar to latent SVM. Experiments on synthetic and real data sets demonstrate the success of the approach.
ER  -
APA
Sabato, S., Srebro, N., & Tishby, N. (2010). Reducing Label Complexity by Learning From Bags. Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 9:685-692. Available from https://proceedings.mlr.press/v9/sabato10a.html.
