Open Problem: The Sample Complexity of Multi-Distribution Learning for VC Classes

Pranjal Awasthi, Nika Haghtalab, Eric Zhao
Proceedings of Thirty Sixth Conference on Learning Theory, PMLR 195:5943-5949, 2023.

Abstract

Multi-distribution learning is a natural generalization of PAC learning to settings with multiple data distributions. There remains a significant gap between the known upper and lower bounds for PAC-learnable classes. In particular, though we understand the sample complexity of learning a VC dimension $d$ class on $k$ distributions to be $O(\epsilon^{-2} \ln(k) (d + k) + \min \{\epsilon^{-1} d k, \epsilon^{-4} \ln(k) d\})$, the best lower bound is $\Omega(\epsilon^{-2}(d + k \ln(k)))$. We discuss recent progress on this problem and some hurdles that are fundamental to the use of game dynamics in statistical learning.
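
For context, a minimal formal statement of the multi-distribution learning objective may help; the formulation below paraphrases the standard definition (the class notation $\mathcal{H}$ and error notation $\mathrm{err}_{D_i}$ are ours, not text from the paper). Given a hypothesis class $\mathcal{H}$ of VC dimension $d$ and distributions $D_1, \ldots, D_k$, the learner draws samples from the $D_i$ and must output a (possibly randomized) hypothesis $\hat{h}$ satisfying

$$ \max_{i \in [k]} \mathrm{err}_{D_i}(\hat{h}) \;\le\; \min_{h \in \mathcal{H}} \, \max_{i \in [k]} \mathrm{err}_{D_i}(h) \;+\; \epsilon, $$

using as few total samples as possible. The open problem asks how far the $O(\epsilon^{-2} \ln(k) (d + k) + \min \{\epsilon^{-1} d k, \epsilon^{-4} \ln(k) d\})$ upper bound can be brought down toward the $\Omega(\epsilon^{-2}(d + k \ln(k)))$ lower bound stated above.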

Cite this Paper


BibTeX
@InProceedings{pmlr-v195-awasthi23a,
  title     = {Open Problem: The Sample Complexity of Multi-Distribution Learning for VC Classes},
  author    = {Awasthi, Pranjal and Haghtalab, Nika and Zhao, Eric},
  booktitle = {Proceedings of Thirty Sixth Conference on Learning Theory},
  pages     = {5943--5949},
  year      = {2023},
  editor    = {Neu, Gergely and Rosasco, Lorenzo},
  volume    = {195},
  series    = {Proceedings of Machine Learning Research},
  month     = {12--15 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v195/awasthi23a/awasthi23a.pdf},
  url       = {https://proceedings.mlr.press/v195/awasthi23a.html},
  abstract  = {Multi-distribution learning is a natural generalization of PAC learning to settings with multiple data distributions. There remains a significant gap between the known upper and lower bounds for PAC-learnable classes. In particular, though we understand the sample complexity of learning a VC dimension $d$ class on $k$ distributions to be $O(\epsilon^{-2} \ln(k) (d + k) + \min \{\epsilon^{-1} d k, \epsilon^{-4} \ln(k) d\})$, the best lower bound is $\Omega(\epsilon^{-2}(d + k \ln(k)))$. We discuss recent progress on this problem and some hurdles that are fundamental to the use of game dynamics in statistical learning.}
}
Endnote
%0 Conference Paper
%T Open Problem: The Sample Complexity of Multi-Distribution Learning for VC Classes
%A Pranjal Awasthi
%A Nika Haghtalab
%A Eric Zhao
%B Proceedings of Thirty Sixth Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2023
%E Gergely Neu
%E Lorenzo Rosasco
%F pmlr-v195-awasthi23a
%I PMLR
%P 5943--5949
%U https://proceedings.mlr.press/v195/awasthi23a.html
%V 195
%X Multi-distribution learning is a natural generalization of PAC learning to settings with multiple data distributions. There remains a significant gap between the known upper and lower bounds for PAC-learnable classes. In particular, though we understand the sample complexity of learning a VC dimension $d$ class on $k$ distributions to be $O(\epsilon^{-2} \ln(k) (d + k) + \min \{\epsilon^{-1} d k, \epsilon^{-4} \ln(k) d\})$, the best lower bound is $\Omega(\epsilon^{-2}(d + k \ln(k)))$. We discuss recent progress on this problem and some hurdles that are fundamental to the use of game dynamics in statistical learning.
APA
Awasthi, P., Haghtalab, N. & Zhao, E. (2023). Open Problem: The Sample Complexity of Multi-Distribution Learning for VC Classes. Proceedings of Thirty Sixth Conference on Learning Theory, in Proceedings of Machine Learning Research 195:5943-5949. Available from https://proceedings.mlr.press/v195/awasthi23a.html.