Reducing Label Complexity by Learning From Bags

Sivan Sabato; Nathan Srebro; Naftali Tishby

Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, PMLR 9:685-692, 2010.

Abstract

We consider a supervised learning setting in which the main cost of learning is the number of training labels and one can obtain a single label for a bag of examples, indicating only if a positive example exists in the bag, as in Multi-Instance Learning. We thus propose to create a training sample of bags, and to use the obtained labels to learn to classify individual examples. We provide a theoretical analysis showing how to select the bag size as a function of the problem parameters, and prove that if the original labels are distributed unevenly, the number of required labels drops considerably when learning from bags. We demonstrate that finding a low-error separating hyperplane from bags is feasible in this setting using a simple iterative procedure similar to latent SVM. Experiments on synthetic and real data sets demonstrate the success of the approach.
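The bag-labeling setup described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names, the random disjoint grouping, and the choice to drop an incomplete trailing bag are all assumptions made for illustration. It only shows how one label per bag (positive if any member is positive) replaces one label per example, and how often a bag is positive when individual labels are i.i.d. with a small positive rate `p`.

```python
import random


def make_bags(examples, labels, bag_size, seed=0):
    """Group examples into disjoint bags (hypothetical helper).

    A bag is labeled positive iff at least one of its members is positive,
    which is the only information the bag label carries.
    """
    if bag_size < 1:
        raise ValueError("bag_size must be at least 1")
    order = list(range(len(examples)))
    random.Random(seed).shuffle(order)
    bags, bag_labels = [], []
    # Assumption: drop the incomplete trailing bag so all bags have equal size.
    for start in range(0, len(order) - bag_size + 1, bag_size):
        idx = order[start:start + bag_size]
        bags.append([examples[i] for i in idx])
        bag_labels.append(any(labels[i] for i in idx))
    return bags, bag_labels


def positive_bag_rate(p, bag_size):
    """Probability that a bag contains a positive, assuming i.i.d. labels with rate p."""
    return 1.0 - (1.0 - p) ** bag_size


if __name__ == "__main__":
    xs = list(range(12))
    ys = [i == 0 for i in xs]  # a single positive example among twelve
    bags, bag_labels = make_bags(xs, ys, bag_size=4)
    print(len(bags), "bag labels requested instead of", len(xs), "example labels")
    print("positive bags:", sum(bag_labels))
    print("P(bag positive) for p=0.01, size 20:", positive_bag_rate(0.01, 20))
```

With rare positives, most bags of moderate size are negative, so a single negative bag label rules out every example in that bag at once. The paper's analysis of how to choose the bag size is not reproduced here.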

Cite this Paper

BibTeX


@InProceedings{pmlr-v9-sabato10a,
  title = 	 {Reducing Label Complexity by Learning From Bags},
  author = 	 {Sabato, Sivan and Srebro, Nathan and Tishby, Naftali},
  booktitle = 	 {Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics},
  pages = 	 {685--692},
  year = 	 {2010},
  editor = 	 {Teh, Yee Whye and Titterington, Mike},
  volume = 	 {9},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Chia Laguna Resort, Sardinia, Italy},
  month = 	 {13--15 May},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v9/sabato10a/sabato10a.pdf},
  url = 	 {https://proceedings.mlr.press/v9/sabato10a.html},
  abstract = 	 {We consider a supervised learning setting in which the main cost of learning is the number of training labels and one can obtain a single label for a bag of examples, indicating only if a positive example exists in the bag, as in Multi-Instance Learning. We thus propose to create a training sample of bags, and to use the obtained labels to learn to classify individual examples. We provide a theoretical analysis showing how to select the bag size as a function of the problem parameters, and prove that if the original labels are distributed unevenly, the number of required labels drops considerably when learning from bags. We demonstrate that finding a low-error separating hyperplane from bags is feasible in this setting using a simple iterative procedure similar to latent SVM. Experiments on synthetic and real data sets demonstrate the success of the approach.}
}

Endnote

%0 Conference Paper
%T Reducing Label Complexity by Learning From Bags
%A Sivan Sabato
%A Nathan Srebro
%A Naftali Tishby
%B Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2010
%E Yee Whye Teh
%E Mike Titterington
%F pmlr-v9-sabato10a
%I PMLR
%P 685--692
%U https://proceedings.mlr.press/v9/sabato10a.html
%V 9
%X We consider a supervised learning setting in which the main cost of learning is the number of training labels and one can obtain a single label for a bag of examples, indicating only if a positive example exists in the bag, as in Multi-Instance Learning. We thus propose to create a training sample of bags, and to use the obtained labels to learn to classify individual examples. We provide a theoretical analysis showing how to select the bag size as a function of the problem parameters, and prove that if the original labels are distributed unevenly, the number of required labels drops considerably when learning from bags. We demonstrate that finding a low-error separating hyperplane from bags is feasible in this setting using a simple iterative procedure similar to latent SVM. Experiments on synthetic and real data sets demonstrate the success of the approach.

RIS


TY  - CPAPER
TI  - Reducing Label Complexity by Learning From Bags
AU  - Sivan Sabato
AU  - Nathan Srebro
AU  - Naftali Tishby
BT  - Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics
DA  - 2010/03/31
ED  - Yee Whye Teh
ED  - Mike Titterington
ID  - pmlr-v9-sabato10a
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 9
SP  - 685
EP  - 692
L1  - http://proceedings.mlr.press/v9/sabato10a/sabato10a.pdf
UR  - https://proceedings.mlr.press/v9/sabato10a.html
AB  - We consider a supervised learning setting in which the main cost of learning is the number of training labels and one can obtain a single label for a bag of examples, indicating only if a positive example exists in the bag, as in Multi-Instance Learning. We thus propose to create a training sample of bags, and to use the obtained labels to learn to classify individual examples. We provide a theoretical analysis showing how to select the bag size as a function of the problem parameters, and prove that if the original labels are distributed unevenly, the number of required labels drops considerably when learning from bags. We demonstrate that finding a low-error separating hyperplane from bags is feasible in this setting using a simple iterative procedure similar to latent SVM. Experiments on synthetic and real data sets demonstrate the success of the approach.
ER  -

APA


Sabato, S., Srebro, N. & Tishby, N. (2010). Reducing Label Complexity by Learning From Bags. Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 9:685-692. Available from https://proceedings.mlr.press/v9/sabato10a.html.
