Open Problem: The Sample Complexity of Multi-Distribution Learning for VC Classes
Proceedings of the Thirty Sixth Conference on Learning Theory, PMLR 195:5943-5949, 2023.
Abstract
Multi-distribution learning is a natural generalization of PAC learning to settings with multiple data distributions. There remains a significant gap between the known upper and lower bounds for PAC-learnable classes. In particular, though we understand the sample complexity of learning a VC dimension $d$ class on $k$ distributions to be $O(\epsilon^{-2} \ln(k) (d + k) + \min \{\epsilon^{-1} d k, \epsilon^{-4} \ln(k) d\})$, the best lower bound is $\Omega(\epsilon^{-2}(d + k \ln(k)))$. We discuss recent progress on this problem and some hurdles that are fundamental to the use of game dynamics in statistical learning.
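To make the min-max formulation behind "game dynamics" concrete, here is a minimal, hypothetical sketch (not taken from the paper): a no-regret (Hedge) adversary maintains weights over the $k$ distributions while the learner plays an ERM best response, and the uniform mixture of the hypotheses played approximates the min-max optimal randomized classifier. The one-dimensional threshold class, the step size `eta`, the round count, and the toy data are illustrative assumptions, not quantities from the paper.

```python
import numpy as np


def erm_threshold(X, y, w):
    # Weighted ERM oracle over 1-D threshold classifiers h_t(x) = 1[x >= t],
    # a toy VC-dimension-1 class standing in for a general VC class.
    thresholds = np.concatenate(([-np.inf], np.unique(X)))
    best_t, best_err = thresholds[0], np.inf
    for t in thresholds:
        err = np.sum(w * ((X >= t).astype(int) != y))
        if err < best_err:
            best_t, best_err = t, err
    return best_t


def multi_distribution_learn(datasets, rounds=200, eta=0.1):
    # Hedge dynamics for the max player (weights over the k empirical
    # distributions) against an ERM best response for the min player.
    k = len(datasets)
    p = np.ones(k) / k
    played = []
    for _ in range(rounds):
        # Learner: best-respond to the current mixture of empirical distributions.
        X = np.concatenate([Xi for Xi, _ in datasets])
        y = np.concatenate([yi for _, yi in datasets])
        w = np.concatenate([
            p[i] * np.ones(len(datasets[i][0])) / len(datasets[i][0])
            for i in range(k)
        ])
        t = erm_threshold(X, y, w)
        played.append(t)
        # Adversary: upweight the distributions on which the learner errs most.
        losses = np.array([np.mean((Xi >= t).astype(int) != yi) for Xi, yi in datasets])
        p = p * np.exp(eta * losses)
        p /= p.sum()
    return played


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Two toy distributions whose optimal thresholds disagree.
    X1 = rng.uniform(0.0, 1.0, 60)
    y1 = (X1 >= 0.3).astype(int)
    X2 = rng.uniform(0.0, 1.0, 60)
    y2 = (X2 >= 0.7).astype(int)
    played = multi_distribution_learn([(X1, y1), (X2, y2)])
    for i, (Xi, yi) in enumerate([(X1, y1), (X2, y2)]):
        err = np.mean([np.mean((Xi >= t).astype(int) != yi) for t in played])
        print(f"distribution {i}: error of the uniform mixture of played hypotheses = {err:.3f}")
```

In this sketch the sample cost comes from the fact that each round of the dynamics queries fresh (or reweighted) data from the $k$ distributions; the open problem asks how few samples such procedures, or any others, can get away with while still driving the worst-distribution error below $\epsilon$.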