Hierarchical Label Queries with Data-Dependent Partitions

Samory Kpotufe, Ruth Urner, Shai Ben-David
; Proceedings of The 28th Conference on Learning Theory, PMLR 40:1176-1189, 2015.

Abstract

Given a joint distribution P_X, Y over a space \Xcal and a label set \Ycal=\braces0, 1, we consider the problem of recovering the labels of an unlabeled sample with as few label queries as possible. The recovered labels can be passed to a passive learner, thus turning the procedure into an active learning approach. We analyze a family of labeling procedures based on a hierarchical clustering of the data. While such labeling procedures have been studied in the past, we provide a new parametrization of P_X, Y that captures their behavior in general low-noise settings, and which accounts for data-dependent clustering, thus providing new theoretical underpinning to practically used tools.

Cite this Paper


BibTeX
@InProceedings{pmlr-v40-Kpotufe15, title = {Hierarchical Label Queries with Data-Dependent Partitions}, author = {Samory Kpotufe and Ruth Urner and Shai Ben-David}, booktitle = {Proceedings of The 28th Conference on Learning Theory}, pages = {1176--1189}, year = {2015}, editor = {Peter Grünwald and Elad Hazan and Satyen Kale}, volume = {40}, series = {Proceedings of Machine Learning Research}, address = {Paris, France}, month = {03--06 Jul}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v40/Kpotufe15.pdf}, url = {http://proceedings.mlr.press/v40/Kpotufe15.html}, abstract = {Given a joint distribution P_X, Y over a space \Xcal and a label set \Ycal=\braces0, 1, we consider the problem of recovering the labels of an unlabeled sample with as few label queries as possible. The recovered labels can be passed to a passive learner, thus turning the procedure into an active learning approach. We analyze a family of labeling procedures based on a hierarchical clustering of the data. While such labeling procedures have been studied in the past, we provide a new parametrization of P_X, Y that captures their behavior in general low-noise settings, and which accounts for data-dependent clustering, thus providing new theoretical underpinning to practically used tools. } }
Endnote
%0 Conference Paper %T Hierarchical Label Queries with Data-Dependent Partitions %A Samory Kpotufe %A Ruth Urner %A Shai Ben-David %B Proceedings of The 28th Conference on Learning Theory %C Proceedings of Machine Learning Research %D 2015 %E Peter Grünwald %E Elad Hazan %E Satyen Kale %F pmlr-v40-Kpotufe15 %I PMLR %J Proceedings of Machine Learning Research %P 1176--1189 %U http://proceedings.mlr.press %V 40 %W PMLR %X Given a joint distribution P_X, Y over a space \Xcal and a label set \Ycal=\braces0, 1, we consider the problem of recovering the labels of an unlabeled sample with as few label queries as possible. The recovered labels can be passed to a passive learner, thus turning the procedure into an active learning approach. We analyze a family of labeling procedures based on a hierarchical clustering of the data. While such labeling procedures have been studied in the past, we provide a new parametrization of P_X, Y that captures their behavior in general low-noise settings, and which accounts for data-dependent clustering, thus providing new theoretical underpinning to practically used tools.
RIS
TY - CPAPER TI - Hierarchical Label Queries with Data-Dependent Partitions AU - Samory Kpotufe AU - Ruth Urner AU - Shai Ben-David BT - Proceedings of The 28th Conference on Learning Theory PY - 2015/06/26 DA - 2015/06/26 ED - Peter Grünwald ED - Elad Hazan ED - Satyen Kale ID - pmlr-v40-Kpotufe15 PB - PMLR SP - 1176 DP - PMLR EP - 1189 L1 - http://proceedings.mlr.press/v40/Kpotufe15.pdf UR - http://proceedings.mlr.press/v40/Kpotufe15.html AB - Given a joint distribution P_X, Y over a space \Xcal and a label set \Ycal=\braces0, 1, we consider the problem of recovering the labels of an unlabeled sample with as few label queries as possible. The recovered labels can be passed to a passive learner, thus turning the procedure into an active learning approach. We analyze a family of labeling procedures based on a hierarchical clustering of the data. While such labeling procedures have been studied in the past, we provide a new parametrization of P_X, Y that captures their behavior in general low-noise settings, and which accounts for data-dependent clustering, thus providing new theoretical underpinning to practically used tools. ER -
APA
Kpotufe, S., Urner, R. & Ben-David, S.. (2015). Hierarchical Label Queries with Data-Dependent Partitions. Proceedings of The 28th Conference on Learning Theory, in PMLR 40:1176-1189

Related Material