Active Heteroscedastic Regression


Kamalika Chaudhuri, Prateek Jain, Nagarajan Natarajan ;
Proceedings of the 34th International Conference on Machine Learning, PMLR 70:694-702, 2017.


An active learner is given a model class $\Theta$, a large sample of unlabeled data drawn from an underlying distribution and access to a labeling oracle that can provide a label for any of the unlabeled instances. The goal of the learner is to find a model $\theta \in \Theta$ that fits the data to a given accuracy while making as few label queries to the oracle as possible. In this work, we consider a theoretical analysis of the label requirement of active learning for regression under a heteroscedastic noise model, where the noise depends on the instance. We provide bounds on the convergence rates of active and passive learning for heteroscedastic regression. Our results illustrate that just like in binary classification, some partial knowledge of the nature of the noise can lead to significant gains in the label requirement of active learning.

Related Material