S2: An Efficient Graph Based Active Learning Algorithm with Application to Nonparametric Classification

Gautam Dasarathy, Robert Nowak, Xiaojin Zhu
Proceedings of The 28th Conference on Learning Theory, PMLR 40:503-522, 2015.

Abstract

This paper investigates the problem of active learning for binary label prediction on a graph. We introduce a simple and label-efficient algorithm called S^2 for this task. At each step, S^2 selects the vertex to be labeled based on the structure of the graph and all previously gathered labels. Specifically, S^2 queries for the label of the vertex that bisects the \em shortest shortest path between any pair of oppositely labeled vertices. We present a theoretical estimate of the number of queries S^2 needs in terms of a novel parametrization of the complexity of binary functions on graphs. We also present experimental results demonstrating the performance of S^2 on both real and synthetic data. While other graph-based active learning algorithms have shown promise in practice, our algorithm is the first with both good performance and theoretical guarantees. Finally, we demonstrate the implications of the S^2 algorithm to the theory of nonparametric active learning. In particular, we show that S^2 achieves near minimax optimal excess risk for an important class of nonparametric classification problems.

Cite this Paper


BibTeX
@InProceedings{pmlr-v40-Dasarathy15, title = {S2: An Efficient Graph Based Active Learning Algorithm with Application to Nonparametric Classification}, author = {Dasarathy, Gautam and Nowak, Robert and Zhu, Xiaojin}, booktitle = {Proceedings of The 28th Conference on Learning Theory}, pages = {503--522}, year = {2015}, editor = {Grünwald, Peter and Hazan, Elad and Kale, Satyen}, volume = {40}, series = {Proceedings of Machine Learning Research}, address = {Paris, France}, month = {03--06 Jul}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v40/Dasarathy15.pdf}, url = {https://proceedings.mlr.press/v40/Dasarathy15.html}, abstract = {This paper investigates the problem of active learning for binary label prediction on a graph. We introduce a simple and label-efficient algorithm called S^2 for this task. At each step, S^2 selects the vertex to be labeled based on the structure of the graph and all previously gathered labels. Specifically, S^2 queries for the label of the vertex that bisects the \em shortest shortest path between any pair of oppositely labeled vertices. We present a theoretical estimate of the number of queries S^2 needs in terms of a novel parametrization of the complexity of binary functions on graphs. We also present experimental results demonstrating the performance of S^2 on both real and synthetic data. While other graph-based active learning algorithms have shown promise in practice, our algorithm is the first with both good performance and theoretical guarantees. Finally, we demonstrate the implications of the S^2 algorithm to the theory of nonparametric active learning. In particular, we show that S^2 achieves near minimax optimal excess risk for an important class of nonparametric classification problems.} }
Endnote
%0 Conference Paper %T S2: An Efficient Graph Based Active Learning Algorithm with Application to Nonparametric Classification %A Gautam Dasarathy %A Robert Nowak %A Xiaojin Zhu %B Proceedings of The 28th Conference on Learning Theory %C Proceedings of Machine Learning Research %D 2015 %E Peter Grünwald %E Elad Hazan %E Satyen Kale %F pmlr-v40-Dasarathy15 %I PMLR %P 503--522 %U https://proceedings.mlr.press/v40/Dasarathy15.html %V 40 %X This paper investigates the problem of active learning for binary label prediction on a graph. We introduce a simple and label-efficient algorithm called S^2 for this task. At each step, S^2 selects the vertex to be labeled based on the structure of the graph and all previously gathered labels. Specifically, S^2 queries for the label of the vertex that bisects the \em shortest shortest path between any pair of oppositely labeled vertices. We present a theoretical estimate of the number of queries S^2 needs in terms of a novel parametrization of the complexity of binary functions on graphs. We also present experimental results demonstrating the performance of S^2 on both real and synthetic data. While other graph-based active learning algorithms have shown promise in practice, our algorithm is the first with both good performance and theoretical guarantees. Finally, we demonstrate the implications of the S^2 algorithm to the theory of nonparametric active learning. In particular, we show that S^2 achieves near minimax optimal excess risk for an important class of nonparametric classification problems.
RIS
TY - CPAPER TI - S2: An Efficient Graph Based Active Learning Algorithm with Application to Nonparametric Classification AU - Gautam Dasarathy AU - Robert Nowak AU - Xiaojin Zhu BT - Proceedings of The 28th Conference on Learning Theory DA - 2015/06/26 ED - Peter Grünwald ED - Elad Hazan ED - Satyen Kale ID - pmlr-v40-Dasarathy15 PB - PMLR DP - Proceedings of Machine Learning Research VL - 40 SP - 503 EP - 522 L1 - http://proceedings.mlr.press/v40/Dasarathy15.pdf UR - https://proceedings.mlr.press/v40/Dasarathy15.html AB - This paper investigates the problem of active learning for binary label prediction on a graph. We introduce a simple and label-efficient algorithm called S^2 for this task. At each step, S^2 selects the vertex to be labeled based on the structure of the graph and all previously gathered labels. Specifically, S^2 queries for the label of the vertex that bisects the \em shortest shortest path between any pair of oppositely labeled vertices. We present a theoretical estimate of the number of queries S^2 needs in terms of a novel parametrization of the complexity of binary functions on graphs. We also present experimental results demonstrating the performance of S^2 on both real and synthetic data. While other graph-based active learning algorithms have shown promise in practice, our algorithm is the first with both good performance and theoretical guarantees. Finally, we demonstrate the implications of the S^2 algorithm to the theory of nonparametric active learning. In particular, we show that S^2 achieves near minimax optimal excess risk for an important class of nonparametric classification problems. ER -
APA
Dasarathy, G., Nowak, R. & Zhu, X.. (2015). S2: An Efficient Graph Based Active Learning Algorithm with Application to Nonparametric Classification. Proceedings of The 28th Conference on Learning Theory, in Proceedings of Machine Learning Research 40:503-522 Available from https://proceedings.mlr.press/v40/Dasarathy15.html.

Related Material