Efficient active learning of sparse halfspaces

Chicheng Zhang

Proceedings of the 31st Conference On Learning Theory, PMLR 75:1856-1880, 2018.

Abstract

We study the problem of efficient PAC active learning of homogeneous linear classifiers (halfspaces) in $\mathbb{R}^d$, where the goal is to learn a halfspace with low error using as few label queries as possible. Under the extra assumption that there is a $t$-sparse halfspace that performs well on the data ($t \ll d$), we would like our active learning algorithm to be {\em attribute efficient}, i.e. to have label requirements sublinear in $d$. In this paper, we provide a computationally efficient algorithm that achieves this goal. Under certain distributional assumptions on the data, our algorithm achieves a label complexity of $O(t \cdot \mathrm{polylog}(d, \frac 1 \epsilon))$. In contrast, existing algorithms in this setting are either computationally inefficient, or subject to label requirements polynomial in $d$ or $\frac 1 \epsilon$.

Cite this Paper

BibTeX


@InProceedings{pmlr-v75-zhang18b,
  title     = {Efficient active learning of sparse halfspaces},
  author    = {Zhang, Chicheng},
  booktitle = {Proceedings of the 31st Conference On Learning Theory},
  pages     = {1856--1880},
  year      = {2018},
  editor    = {Bubeck, Sébastien and Perchet, Vianney and Rigollet, Philippe},
  volume    = {75},
  series    = {Proceedings of Machine Learning Research},
  month     = {06--09 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v75/zhang18b/zhang18b.pdf},
  url       = {https://proceedings.mlr.press/v75/zhang18b.html},
  abstract = 	 {We study the problem of efficient PAC active learning of homogeneous linear classifiers (halfspaces) in $\mathbb{R}^d$, where the goal is to learn a halfspace with low error using as few label queries as possible. Under the extra assumption that there is a $t$-sparse halfspace that performs well on the data ($t \ll d$), we would like our active learning algorithm to be {\em attribute efficient}, i.e. to have label requirements sublinear in $d$. In this paper, we provide a computationally efficient algorithm that achieves this goal. Under certain distributional assumptions on the data, our algorithm achieves a label complexity of $O(t \cdot \mathrm{polylog}(d, \frac 1 \epsilon))$. In contrast, existing algorithms in this setting are either computationally inefficient, or subject to label requirements polynomial in $d$ or $\frac 1 \epsilon$.}
}

Endnote

%0 Conference Paper
%T Efficient active learning of sparse halfspaces
%A Chicheng Zhang
%B Proceedings of the 31st Conference On Learning Theory
%C Proceedings of Machine Learning Research
%D 2018
%E Sébastien Bubeck
%E Vianney Perchet
%E Philippe Rigollet
%F pmlr-v75-zhang18b
%I PMLR
%P 1856--1880
%U https://proceedings.mlr.press/v75/zhang18b.html
%V 75
%X We study the problem of efficient PAC active learning of homogeneous linear classifiers (halfspaces) in $\mathbb{R}^d$, where the goal is to learn a halfspace with low error using as few label queries as possible. Under the extra assumption that there is a $t$-sparse halfspace that performs well on the data ($t \ll d$), we would like our active learning algorithm to be {\em attribute efficient}, i.e. to have label requirements sublinear in $d$. In this paper, we provide a computationally efficient algorithm that achieves this goal. Under certain distributional assumptions on the data, our algorithm achieves a label complexity of $O(t \cdot \mathrm{polylog}(d, \frac 1 \epsilon))$. In contrast, existing algorithms in this setting are either computationally inefficient, or subject to label requirements polynomial in $d$ or $\frac 1 \epsilon$.

APA


Zhang, C. (2018). Efficient active learning of sparse halfspaces. Proceedings of the 31st Conference On Learning Theory, in Proceedings of Machine Learning Research 75:1856-1880. Available from https://proceedings.mlr.press/v75/zhang18b.html.
