Bounded Memory Active Learning through Enriched Queries

Max Hopkins, Daniel Kane, Shachar Lovett, Michal Moshkovitz
Proceedings of Thirty Fourth Conference on Learning Theory, PMLR 134:2358-2387, 2021.

Abstract

The explosive growth of easily-accessible unlabeled data has led to growing interest in active learning, a paradigm in which data-hungry learning algorithms adaptively select informative examples in order to lower prohibitively expensive labeling costs. Unfortunately, in standard worst-case models of learning, the active setting often provides no improvement over non-adaptive algorithms. To combat this, a series of recent works have considered a model in which the learner may ask enriched queries beyond labels. While such models have seen success in drastically lowering label costs, they tend to come at the expense of requiring large amounts of memory. In this work, we study which families of classifiers can be learned in bounded memory. To this end, we introduce a novel streaming variant of enriched-query active learning, along with a natural combinatorial parameter called lossless sample compression that is sufficient for learning not only with bounded memory, but in a query-optimal and computationally efficient manner as well. Finally, we give three fundamental examples of classifier families with small, easy-to-compute lossless compression schemes when given access to basic enriched queries: axis-aligned rectangles, decision trees, and halfspaces in two dimensions.
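
To make the bounded-memory idea concrete, the following is a minimal Python sketch, not taken from the paper, of why axis-aligned rectangles admit a small lossless compression: the tightest rectangle consistent with the positive examples seen so far is determined by at most two values per coordinate (a minimum and a maximum), so a streaming learner can query each point and then discard it. The function name learn_rectangle_stream and its label-oracle interface are hypothetical names chosen for illustration; the sketch uses plain label queries on every point, so it captures only the memory bound, not the query savings the paper derives from enriched queries.

from typing import Callable, Iterable, Sequence, Tuple

def learn_rectangle_stream(
    stream: Iterable[Sequence[float]],
    label: Callable[[Sequence[float]], bool],  # label oracle (query access)
    dim: int,
) -> Tuple[list, list]:
    """Return (lows, highs): the tightest rectangle around queried positives."""
    lows = [float("inf")] * dim
    highs = [float("-inf")] * dim
    for x in stream:
        # Query the point's label, update the extremes, then discard the point.
        if label(x):
            for i in range(dim):
                lows[i] = min(lows[i], x[i])
                highs[i] = max(highs[i], x[i])
        # Memory used so far: only `lows` and `highs`, i.e. 2 * dim numbers.
    return lows, highs

if __name__ == "__main__":
    import random
    target_low, target_high = [0.2, 0.3], [0.7, 0.9]
    def oracle(x):
        return all(l <= xi <= h for l, xi, h in zip(target_low, x, target_high))
    points = [[random.random(), random.random()] for _ in range(10_000)]
    lows, highs = learn_rectangle_stream(points, oracle, dim=2)
    print("learned rectangle:", lows, highs)

Running the toy driver should recover a close approximation of the hidden rectangle from a stream of 10,000 points while storing only four numbers of state, independent of the stream length.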

Cite this Paper


BibTeX
@InProceedings{pmlr-v134-hopkins21a,
  title     = {Bounded Memory Active Learning through Enriched Queries},
  author    = {Hopkins, Max and Kane, Daniel and Lovett, Shachar and Moshkovitz, Michal},
  booktitle = {Proceedings of Thirty Fourth Conference on Learning Theory},
  pages     = {2358--2387},
  year      = {2021},
  editor    = {Belkin, Mikhail and Kpotufe, Samory},
  volume    = {134},
  series    = {Proceedings of Machine Learning Research},
  month     = {15--19 Aug},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v134/hopkins21a/hopkins21a.pdf},
  url       = {https://proceedings.mlr.press/v134/hopkins21a.html},
  abstract  = {The explosive growth of easily-accessible unlabeled data has led to growing interest in \emph{active learning}, a paradigm in which data-hungry learning algorithms adaptively select informative examples in order to lower prohibitively expensive labeling costs. Unfortunately, in standard worst-case models of learning, the active setting often provides no improvement over non-adaptive algorithms. To combat this, a series of recent works have considered a model in which the learner may ask \emph{enriched} queries beyond labels. While such models have seen success in drastically lowering label costs, they tend to come at the expense of requiring large amounts of memory. In this work, we study which families of classifiers can be learned in \emph{bounded memory}. To this end, we introduce a novel streaming variant of enriched-query active learning, along with a natural combinatorial parameter called \emph{lossless sample compression} that is sufficient for learning not only with bounded memory, but in a query-optimal and computationally efficient manner as well. Finally, we give three fundamental examples of classifier families with small, easy-to-compute lossless compression schemes when given access to basic enriched queries: axis-aligned rectangles, decision trees, and halfspaces in two dimensions.}
}
Endnote
%0 Conference Paper
%T Bounded Memory Active Learning through Enriched Queries
%A Max Hopkins
%A Daniel Kane
%A Shachar Lovett
%A Michal Moshkovitz
%B Proceedings of Thirty Fourth Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2021
%E Mikhail Belkin
%E Samory Kpotufe
%F pmlr-v134-hopkins21a
%I PMLR
%P 2358--2387
%U https://proceedings.mlr.press/v134/hopkins21a.html
%V 134
%X The explosive growth of easily-accessible unlabeled data has led to growing interest in active learning, a paradigm in which data-hungry learning algorithms adaptively select informative examples in order to lower prohibitively expensive labeling costs. Unfortunately, in standard worst-case models of learning, the active setting often provides no improvement over non-adaptive algorithms. To combat this, a series of recent works have considered a model in which the learner may ask enriched queries beyond labels. While such models have seen success in drastically lowering label costs, they tend to come at the expense of requiring large amounts of memory. In this work, we study which families of classifiers can be learned in bounded memory. To this end, we introduce a novel streaming variant of enriched-query active learning, along with a natural combinatorial parameter called lossless sample compression that is sufficient for learning not only with bounded memory, but in a query-optimal and computationally efficient manner as well. Finally, we give three fundamental examples of classifier families with small, easy-to-compute lossless compression schemes when given access to basic enriched queries: axis-aligned rectangles, decision trees, and halfspaces in two dimensions.
APA
Hopkins, M., Kane, D., Lovett, S., & Moshkovitz, M. (2021). Bounded Memory Active Learning through Enriched Queries. Proceedings of Thirty Fourth Conference on Learning Theory, in Proceedings of Machine Learning Research 134:2358-2387. Available from https://proceedings.mlr.press/v134/hopkins21a.html.