Agnostic Active Learning of Single Index Models with Linear Sample Complexity

Aarshvi Gajjar, Wai Ming Tai, Xingyu Xu, Chinmay Hegde, Christopher Musco, Yi Li
Proceedings of Thirty Seventh Conference on Learning Theory, PMLR 247:1715-1754, 2024.

Abstract

We study active learning methods for single index models of the form $F({\bm x}) = f(\langle {\bm w}, {\bm x}\rangle)$, where $f:\mathbb{R} \to \mathbb{R}$ and ${\bm x}, {\bm w} \in \mathbb{R}^d$. In addition to their theoretical interest as simple examples of non-linear neural networks, single index models have received significant recent attention due to applications in scientific machine learning, such as surrogate modeling for partial differential equations (PDEs). Such applications require sample-efficient active learning methods that are robust to adversarial noise, i.e., methods that work even in the challenging agnostic learning setting. We provide two main results on agnostic active learning of single index models. First, when $f$ is known and Lipschitz, we show that $\tilde{O}(d)$ samples collected via statistical leverage score sampling are sufficient to learn a near-optimal single index model. Leverage score sampling is simple to implement, efficient, and already widely used for actively learning linear models. Our result requires no assumptions on the data distribution, is optimal up to log factors, and improves quadratically on a recent $O(d^2)$ bound of Gajjar et al. (2023). Second, we show that $\tilde{O}(d)$ samples suffice even in the more difficult setting when $f$ is \emph{unknown}. Our results leverage tools from high dimensional probability, including Dudley’s inequality and dual Sudakov minoration, as well as a novel, distribution-aware discretization of the class of Lipschitz functions.
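As a rough illustration of the leverage-score-sampling pipeline described above, here is a minimal Python sketch (our illustration, not the authors' code): it computes leverage scores via a QR decomposition, queries labels only at the sampled rows, and fits ${\bm w}$ for a known ReLU link by gradient descent on the importance-reweighted squared loss. The function names, the ReLU link, and the plain gradient-descent solver are all assumptions made for this example; the paper's $\tilde{O}(d)$ guarantee concerns the minimizer of the reweighted risk, not any particular optimizer.

```python
import numpy as np

def leverage_scores(X):
    # Leverage score of row i = squared norm of row i of an orthonormal
    # basis Q for the column span of X; the scores sum to rank(X) <= d.
    Q, _ = np.linalg.qr(X)
    return np.sum(Q ** 2, axis=1)

def active_fit(X, query_label, f, f_prime, m, steps=500, lr=0.5):
    # Draw m rows i.i.d. with probability proportional to leverage, query
    # labels only at those rows, then run gradient descent on the
    # importance-reweighted squared loss sum_j v_j * (f(<x_j, w>) - y_j)^2.
    n, d = X.shape
    tau = leverage_scores(X)
    p = tau / tau.sum()
    idx = np.random.choice(n, size=m, replace=True, p=p)
    # Weights chosen so the reweighted loss is an unbiased estimate of the
    # average squared loss over all n rows.
    v = 1.0 / (n * m * p[idx])
    S = X[idx]
    y = np.array([query_label(i) for i in idx])  # the only labels ever queried
    w = np.ones(d) / np.sqrt(d)  # nonzero start (w = 0 is a ReLU dead point)
    for _ in range(steps):
        z = S @ w
        w -= lr * (S.T @ (2.0 * v * (f(z) - y) * f_prime(z)))
    return w

# Toy usage: known ReLU link, with label noise standing in for the agnostic
# (adversarial) perturbation.
rng = np.random.default_rng(0)
n, d = 5000, 20
X = rng.standard_normal((n, d))
w_star = rng.standard_normal(d)
labels = np.maximum(X @ w_star, 0.0) + 0.1 * rng.standard_normal(n)
relu = lambda z: np.maximum(z, 0.0)
relu_grad = lambda z: (z > 0).astype(float)
w_hat = active_fit(X, lambda i: labels[i], relu, relu_grad, m=200)
```

Note the key design point the abstract emphasizes: the sampling distribution depends only on the unlabeled data $X$, so all $n$ labels never need to be observed; only the $m$ sampled rows are queried, and the importance weights $v$ correct for the non-uniform sampling.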

Cite this Paper


BibTeX
@InProceedings{pmlr-v247-gajjar24a,
  title     = {Agnostic Active Learning of Single Index Models with Linear Sample Complexity},
  author    = {Gajjar, Aarshvi and Tai, Wai Ming and Xu, Xingyu and Hegde, Chinmay and Musco, Christopher and Li, Yi},
  booktitle = {Proceedings of Thirty Seventh Conference on Learning Theory},
  pages     = {1715--1754},
  year      = {2024},
  editor    = {Agrawal, Shipra and Roth, Aaron},
  volume    = {247},
  series    = {Proceedings of Machine Learning Research},
  month     = {30 Jun--03 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v247/gajjar24a/gajjar24a.pdf},
  url       = {https://proceedings.mlr.press/v247/gajjar24a.html},
  abstract  = {We study active learning methods for single index models of the form $F({\bm x}) = f(\langle {\bm w}, {\bm x}\rangle)$, where $f:\mathbb{R} \to \mathbb{R}$ and ${\bm x}, {\bm w} \in \mathbb{R}^d$. In addition to their theoretical interest as simple examples of non-linear neural networks, single index models have received significant recent attention due to applications in scientific machine learning, such as surrogate modeling for partial differential equations (PDEs). Such applications require sample-efficient active learning methods that are robust to adversarial noise, i.e., methods that work even in the challenging agnostic learning setting. We provide two main results on agnostic active learning of single index models. First, when $f$ is known and Lipschitz, we show that $\tilde{O}(d)$ samples collected via statistical leverage score sampling are sufficient to learn a near-optimal single index model. Leverage score sampling is simple to implement, efficient, and already widely used for actively learning linear models. Our result requires no assumptions on the data distribution, is optimal up to log factors, and improves quadratically on a recent $O(d^2)$ bound of Gajjar et al. (2023). Second, we show that $\tilde{O}(d)$ samples suffice even in the more difficult setting when $f$ is \emph{unknown}. Our results leverage tools from high dimensional probability, including Dudley’s inequality and dual Sudakov minoration, as well as a novel, distribution-aware discretization of the class of Lipschitz functions.}
}
Endnote
%0 Conference Paper
%T Agnostic Active Learning of Single Index Models with Linear Sample Complexity
%A Aarshvi Gajjar
%A Wai Ming Tai
%A Xingyu Xu
%A Chinmay Hegde
%A Christopher Musco
%A Yi Li
%B Proceedings of Thirty Seventh Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2024
%E Shipra Agrawal
%E Aaron Roth
%F pmlr-v247-gajjar24a
%I PMLR
%P 1715--1754
%U https://proceedings.mlr.press/v247/gajjar24a.html
%V 247
%X We study active learning methods for single index models of the form $F({\bm x}) = f(\langle {\bm w}, {\bm x}\rangle)$, where $f:\mathbb{R} \to \mathbb{R}$ and ${\bm x}, {\bm w} \in \mathbb{R}^d$. In addition to their theoretical interest as simple examples of non-linear neural networks, single index models have received significant recent attention due to applications in scientific machine learning, such as surrogate modeling for partial differential equations (PDEs). Such applications require sample-efficient active learning methods that are robust to adversarial noise, i.e., methods that work even in the challenging agnostic learning setting. We provide two main results on agnostic active learning of single index models. First, when $f$ is known and Lipschitz, we show that $\tilde{O}(d)$ samples collected via statistical leverage score sampling are sufficient to learn a near-optimal single index model. Leverage score sampling is simple to implement, efficient, and already widely used for actively learning linear models. Our result requires no assumptions on the data distribution, is optimal up to log factors, and improves quadratically on a recent $O(d^2)$ bound of Gajjar et al. (2023). Second, we show that $\tilde{O}(d)$ samples suffice even in the more difficult setting when $f$ is \emph{unknown}. Our results leverage tools from high dimensional probability, including Dudley’s inequality and dual Sudakov minoration, as well as a novel, distribution-aware discretization of the class of Lipschitz functions.
APA
Gajjar, A., Tai, W.M., Xu, X., Hegde, C., Musco, C. & Li, Y. (2024). Agnostic Active Learning of Single Index Models with Linear Sample Complexity. Proceedings of Thirty Seventh Conference on Learning Theory, in Proceedings of Machine Learning Research 247:1715-1754. Available from https://proceedings.mlr.press/v247/gajjar24a.html.
