Toward a General Theory of Online Selective Sampling: Trading Off Mistakes and Queries

Steve Hanneke, Liu Yang
Proceedings of The 24th International Conference on Artificial Intelligence and Statistics, PMLR 130:3997-4005, 2021.

Abstract

While the literature on the theory of pool-based active learning has seen much progress in the past 15 years, and is now fairly mature, much less is known about its cousin problem: online selective sampling. In the stochastic online learning setting, there is a stream of iid data, and the learner is required to predict a label for each instance, and we are interested in the rate of growth of the number of mistakes the learner makes. In the selective sampling variant of this problem, after each prediction, the learner can optionally request to observe the true classification of the point. This introduces a trade-off between the number of these queries and the number of mistakes as a function of the number T of samples in the sequence. This work explores various properties of the optimal trade-off curve, both abstractly (for general VC classes), and more-concretely for several constructed examples that expose important properties of the trade-off.

Cite this Paper


BibTeX
@InProceedings{pmlr-v130-hanneke21a, title = { Toward a General Theory of Online Selective Sampling: Trading Off Mistakes and Queries }, author = {Hanneke, Steve and Yang, Liu}, booktitle = {Proceedings of The 24th International Conference on Artificial Intelligence and Statistics}, pages = {3997--4005}, year = {2021}, editor = {Banerjee, Arindam and Fukumizu, Kenji}, volume = {130}, series = {Proceedings of Machine Learning Research}, month = {13--15 Apr}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v130/hanneke21a/hanneke21a.pdf}, url = {http://proceedings.mlr.press/v130/hanneke21a.html}, abstract = { While the literature on the theory of pool-based active learning has seen much progress in the past 15 years, and is now fairly mature, much less is known about its cousin problem: online selective sampling. In the stochastic online learning setting, there is a stream of iid data, and the learner is required to predict a label for each instance, and we are interested in the rate of growth of the number of mistakes the learner makes. In the selective sampling variant of this problem, after each prediction, the learner can optionally request to observe the true classification of the point. This introduces a trade-off between the number of these queries and the number of mistakes as a function of the number T of samples in the sequence. This work explores various properties of the optimal trade-off curve, both abstractly (for general VC classes), and more-concretely for several constructed examples that expose important properties of the trade-off. } }
Endnote
%0 Conference Paper %T Toward a General Theory of Online Selective Sampling: Trading Off Mistakes and Queries %A Steve Hanneke %A Liu Yang %B Proceedings of The 24th International Conference on Artificial Intelligence and Statistics %C Proceedings of Machine Learning Research %D 2021 %E Arindam Banerjee %E Kenji Fukumizu %F pmlr-v130-hanneke21a %I PMLR %P 3997--4005 %U http://proceedings.mlr.press/v130/hanneke21a.html %V 130 %X While the literature on the theory of pool-based active learning has seen much progress in the past 15 years, and is now fairly mature, much less is known about its cousin problem: online selective sampling. In the stochastic online learning setting, there is a stream of iid data, and the learner is required to predict a label for each instance, and we are interested in the rate of growth of the number of mistakes the learner makes. In the selective sampling variant of this problem, after each prediction, the learner can optionally request to observe the true classification of the point. This introduces a trade-off between the number of these queries and the number of mistakes as a function of the number T of samples in the sequence. This work explores various properties of the optimal trade-off curve, both abstractly (for general VC classes), and more-concretely for several constructed examples that expose important properties of the trade-off.
APA
Hanneke, S. & Yang, L.. (2021). Toward a General Theory of Online Selective Sampling: Trading Off Mistakes and Queries . Proceedings of The 24th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 130:3997-4005 Available from http://proceedings.mlr.press/v130/hanneke21a.html.

Related Material