Efficient Active Learning Halfspaces with Tsybakov Noise: A Non-convex Optimization Approach

Yinan Li, Chicheng Zhang
Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, PMLR 238:4744-4752, 2024.

Abstract

We study the problem of computationally and label-efficient PAC active learning of $d$-dimensional halfspaces with Tsybakov noise (Tsybakov, 2004) under structured unlabeled data distributions. Inspired by Diakonikolas et al. (2020c), we prove that any approximate first-order stationary point of a smooth nonconvex loss function yields a halfspace with a low excess error guarantee. In light of this structural result, we design a nonconvex optimization-based algorithm with a label complexity of $\tilde{O}(d (\frac{1}{\epsilon})^{\frac{8-6\alpha}{3\alpha-1}})$, under the assumption that the Tsybakov noise parameter $\alpha \in (\frac13, 1]$. This narrows the gap between the label complexities of previously known efficient passive or active algorithms (Diakonikolas et al., 2020b; Zhang and Li, 2021) and the information-theoretic lower bound in this setting.
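As a quick illustration of the stated bound (not part of the paper itself), the minimal sketch below evaluates the exponent $\frac{8-6\alpha}{3\alpha-1}$ on $\frac{1}{\epsilon}$ at a few values of the Tsybakov noise parameter $\alpha \in (\frac13, 1]$; the function name and the chosen sample values of $\alpha$ are illustrative assumptions.

```python
# Illustration only: evaluate the label-complexity exponent (8 - 6*alpha) / (3*alpha - 1)
# from the abstract, for Tsybakov noise parameters alpha in (1/3, 1].
# The stated label complexity is O-tilde(d * (1/eps)^exponent).

def label_complexity_exponent(alpha: float) -> float:
    """Exponent on 1/epsilon in the O-tilde(d * (1/eps)^exponent) bound."""
    assert 1 / 3 < alpha <= 1, "the bound is stated for alpha in (1/3, 1]"
    return (8 - 6 * alpha) / (3 * alpha - 1)

if __name__ == "__main__":
    for alpha in (0.5, 0.75, 1.0):
        print(f"alpha = {alpha:.2f}: exponent = {label_complexity_exponent(alpha):.3f}")
    # alpha = 1.00 gives exponent 1, i.e. O-tilde(d / eps) labels;
    # the exponent grows without bound as alpha approaches 1/3.
```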

Cite this Paper


BibTeX
@InProceedings{pmlr-v238-li24p,
  title     = {Efficient Active Learning Halfspaces with {T}sybakov Noise: A Non-convex Optimization Approach},
  author    = {Li, Yinan and Zhang, Chicheng},
  booktitle = {Proceedings of The 27th International Conference on Artificial Intelligence and Statistics},
  pages     = {4744--4752},
  year      = {2024},
  editor    = {Dasgupta, Sanjoy and Mandt, Stephan and Li, Yingzhen},
  volume    = {238},
  series    = {Proceedings of Machine Learning Research},
  month     = {02--04 May},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v238/li24p/li24p.pdf},
  url       = {https://proceedings.mlr.press/v238/li24p.html},
  abstract  = {We study the problem of computationally and label efficient PAC active learning $d$-dimensional halfspaces with Tsybakov Noise (Tsybakov, 2004) under structured unlabeled data distributions. Inspired by Diakonikolas et al., (2020c), we prove that any approximate first-order stationary point of a smooth nonconvex loss function yields a halfspace with a low excess error guarantee. In light of the above structural result, we design a nonconvex optimization-based algorithm with a label complexity of $\tilde{O}(d (\frac{1}{\epsilon})^{\frac{8-6\alpha}{3\alpha-1}})$, under the assumption that the Tsybakov noise parameter $\alpha \in (\frac13, 1]$, which narrows down the gap between the label complexities of the previously known efficient passive or active algorithms (Diakonikolas et al., 2020b; Zhang and Li, 2021) and the information-theoretic lower bound in this setting.}
}
Endnote
%0 Conference Paper
%T Efficient Active Learning Halfspaces with Tsybakov Noise: A Non-convex Optimization Approach
%A Yinan Li
%A Chicheng Zhang
%B Proceedings of The 27th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2024
%E Sanjoy Dasgupta
%E Stephan Mandt
%E Yingzhen Li
%F pmlr-v238-li24p
%I PMLR
%P 4744--4752
%U https://proceedings.mlr.press/v238/li24p.html
%V 238
%X We study the problem of computationally and label efficient PAC active learning $d$-dimensional halfspaces with Tsybakov Noise (Tsybakov, 2004) under structured unlabeled data distributions. Inspired by Diakonikolas et al., (2020c), we prove that any approximate first-order stationary point of a smooth nonconvex loss function yields a halfspace with a low excess error guarantee. In light of the above structural result, we design a nonconvex optimization-based algorithm with a label complexity of $\tilde{O}(d (\frac{1}{\epsilon})^{\frac{8-6\alpha}{3\alpha-1}})$, under the assumption that the Tsybakov noise parameter $\alpha \in (\frac13, 1]$, which narrows down the gap between the label complexities of the previously known efficient passive or active algorithms (Diakonikolas et al., 2020b; Zhang and Li, 2021) and the information-theoretic lower bound in this setting.
APA
Li, Y. & Zhang, C. (2024). Efficient Active Learning Halfspaces with Tsybakov Noise: A Non-convex Optimization Approach. Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 238:4744-4752. Available from https://proceedings.mlr.press/v238/li24p.html.
