Efficient Learning with Arbitrary Covariate Shift

Adam Tauman Kalai; Varun Kanade

Efficient Learning with Arbitrary Covariate Shift

Adam Tauman Kalai, Varun Kanade

Proceedings of the 32nd International Conference on Algorithmic Learning Theory, PMLR 132:850-864, 2021.

Abstract

We give an efficient algorithm for learning a binary function in a given class

$C$ of bounded VC dimension, with training data distributed according to

$P$ and test data according to

$Q$ , where

$P$ and

$Q$ may be arbitrary distributions over

$X$ . This is the generic form of what is called \textit{covariate shift}, which is impossible in general as arbitrary

$P$ and

$Q$ may not even overlap. However, recently guarantees were given in a model called PQ-learning (Goldwasser et al., 2020) where the learner has: (a) access to unlabeled test examples from

$Q$ (in addition to labeled samples from

$P$ , i.e., semi-supervised learning); and (b) the option to \textit{reject} any example and abstain from classifying it (i.e., selective classification). The algorithm of Goldwasser et al. (2020) requires an (agnostic) noise-tolerant learner for

$C$ . The present work gives a polynomial-time PQ-learning algorithm, called \textit{Slice-and-Dice}, that uses an oracle to a “reliable” learner for

$C$ , where reliable learning (Kalai et al., 2012) is a model of learning with one-sided noise. Furthermore, this reduction is optimal in the sense that we show the equivalence of reliable and PQ learning.

Cite this Paper

BibTeX


@InProceedings{pmlr-v132-kalai21a,
  title = 	 {Efficient Learning with Arbitrary Covariate Shift},
  author =       {Kalai, Adam Tauman and Kanade, Varun},
  booktitle = 	 {Proceedings of the 32nd International Conference on Algorithmic Learning Theory},
  pages = 	 {850--864},
  year = 	 {2021},
  editor = 	 {Feldman, Vitaly and Ligett, Katrina and Sabato, Sivan},
  volume = 	 {132},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {16--19 Mar},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v132/kalai21a/kalai21a.pdf},
  url = 	 {https://proceedings.mlr.press/v132/kalai21a.html},
  abstract = 	 {We give an efficient algorithm for learning a binary function in a given class $C$ of bounded VC dimension, with training data distributed according to $P$ and test data according to $Q$, where $P$ and $Q$ may be arbitrary distributions over $X$. This is the generic form of what is called \textit{covariate shift}, which is impossible in general as arbitrary $P$ and $Q$ may not even overlap. However, recently guarantees were given in a model called PQ-learning (Goldwasser et al., 2020) where the learner has: (a) access to unlabeled test examples from $Q$ (in addition to labeled samples from $P$, i.e., semi-supervised learning); and (b) the option to \textit{reject} any example and abstain from classifying it (i.e., selective classification). The algorithm of Goldwasser et al. (2020) requires an (agnostic) noise-tolerant learner for $C$. The present work gives a polynomial-time PQ-learning algorithm, called \textit{Slice-and-Dice}, that uses an oracle to a “reliable” learner for $C$, where reliable learning (Kalai et al., 2012) is a model of learning with one-sided noise. Furthermore, this reduction is optimal in the sense that we show the equivalence of reliable and PQ learning.}
}

Endnote

%0 Conference Paper
%T Efficient Learning with Arbitrary Covariate Shift
%A Adam Tauman Kalai
%A Varun Kanade
%B Proceedings of the 32nd International Conference on Algorithmic Learning Theory
%C Proceedings of Machine Learning Research
%D 2021
%E Vitaly Feldman
%E Katrina Ligett
%E Sivan Sabato	
%F pmlr-v132-kalai21a
%I PMLR
%P 850--864
%U https://proceedings.mlr.press/v132/kalai21a.html
%V 132
%X We give an efficient algorithm for learning a binary function in a given class $C$ of bounded VC dimension, with training data distributed according to $P$ and test data according to $Q$, where $P$ and $Q$ may be arbitrary distributions over $X$. This is the generic form of what is called \textit{covariate shift}, which is impossible in general as arbitrary $P$ and $Q$ may not even overlap. However, recently guarantees were given in a model called PQ-learning (Goldwasser et al., 2020) where the learner has: (a) access to unlabeled test examples from $Q$ (in addition to labeled samples from $P$, i.e., semi-supervised learning); and (b) the option to \textit{reject} any example and abstain from classifying it (i.e., selective classification). The algorithm of Goldwasser et al. (2020) requires an (agnostic) noise-tolerant learner for $C$. The present work gives a polynomial-time PQ-learning algorithm, called \textit{Slice-and-Dice}, that uses an oracle to a “reliable” learner for $C$, where reliable learning (Kalai et al., 2012) is a model of learning with one-sided noise. Furthermore, this reduction is optimal in the sense that we show the equivalence of reliable and PQ learning.

APA


Kalai, A.T. & Kanade, V.. (2021). Efficient Learning with Arbitrary Covariate Shift. Proceedings of the 32nd International Conference on Algorithmic Learning Theory, in Proceedings of Machine Learning Research 132:850-864 Available from https://proceedings.mlr.press/v132/kalai21a.html.

Related Material

Download PDF