Is Multi-Distribution Learning as Easy as PAC Learning: Sharp Rates with Bounded Label Noise

Rafael Hanashiro, Abhishek Shetty, Patrick Jaillet
Proceedings of Thirty Ninth Conference on Learning Theory, PMLR 336:3109-3142, 2026.

Abstract

Towards understanding the statistical complexity of learning from heterogeneous sources, we study the problem of multi-distribution learning. Given $k$ data sources, the goal is to output a classifier for each source by exploiting shared structure to reduce sample complexity. We focus on the bounded label noise setting to determine whether the fast $1/\epsilon$ rates achievable in single-task learning extend to this regime with minimal dependence on $k$. Surprisingly, we show that this is not the case. We demonstrate that learning across $k$ distributions inherently incurs slow rates scaling with $k/\epsilon^2$, even under constant noise levels, unless each distribution is learned separately. A key technical contribution is a structured hypothesis-testing framework that captures the statistical cost of certifying near-optimality under bounded noise–a cost we show is unavoidable in the multi-distribution setting. Finally, we prove that when competing with the stronger benchmark of each distribution’s optimal Bayes error, the sample complexity incurs a multiplicative penalty in $k$. This establishes a statistical separation between random classification noise and Massart noise, highlighting a fundamental barrier unique to learning from multiple sources.

Cite this Paper


BibTeX
@InProceedings{pmlr-v336-hanashiro26a, title = {Is {Multi-Distribution Learning} as Easy as {PAC Learning}: Sharp Rates with Bounded Label Noise}, author = {Hanashiro, Rafael and Shetty, Abhishek and Jaillet, Patrick}, booktitle = {Proceedings of Thirty Ninth Conference on Learning Theory}, pages = {3109--3142}, year = {2026}, editor = {Hanneke, Steve and Lattimore, Tor}, volume = {336}, series = {Proceedings of Machine Learning Research}, month = {29 Jun--03 Jul}, publisher = {PMLR}, pdf = {https://raw.githubusercontent.com/mlresearch/v336/main/assets/hanashiro26a/hanashiro26a.pdf}, url = {https://proceedings.mlr.press/v336/hanashiro26a.html}, abstract = {Towards understanding the statistical complexity of learning from heterogeneous sources, we study the problem of multi-distribution learning. Given $k$ data sources, the goal is to output a classifier for each source by exploiting shared structure to reduce sample complexity. We focus on the bounded label noise setting to determine whether the fast $1/\epsilon$ rates achievable in single-task learning extend to this regime with minimal dependence on $k$. Surprisingly, we show that this is not the case. We demonstrate that learning across $k$ distributions inherently incurs slow rates scaling with $k/\epsilon^2$, even under constant noise levels, unless each distribution is learned separately. A key technical contribution is a structured hypothesis-testing framework that captures the statistical cost of certifying near-optimality under bounded noise–a cost we show is unavoidable in the multi-distribution setting. Finally, we prove that when competing with the stronger benchmark of each distribution’s optimal Bayes error, the sample complexity incurs a multiplicative penalty in $k$. This establishes a statistical separation between random classification noise and Massart noise, highlighting a fundamental barrier unique to learning from multiple sources.} }
Endnote
%0 Conference Paper %T Is Multi-Distribution Learning as Easy as PAC Learning: Sharp Rates with Bounded Label Noise %A Rafael Hanashiro %A Abhishek Shetty %A Patrick Jaillet %B Proceedings of Thirty Ninth Conference on Learning Theory %C Proceedings of Machine Learning Research %D 2026 %E Steve Hanneke %E Tor Lattimore %F pmlr-v336-hanashiro26a %I PMLR %P 3109--3142 %U https://proceedings.mlr.press/v336/hanashiro26a.html %V 336 %X Towards understanding the statistical complexity of learning from heterogeneous sources, we study the problem of multi-distribution learning. Given $k$ data sources, the goal is to output a classifier for each source by exploiting shared structure to reduce sample complexity. We focus on the bounded label noise setting to determine whether the fast $1/\epsilon$ rates achievable in single-task learning extend to this regime with minimal dependence on $k$. Surprisingly, we show that this is not the case. We demonstrate that learning across $k$ distributions inherently incurs slow rates scaling with $k/\epsilon^2$, even under constant noise levels, unless each distribution is learned separately. A key technical contribution is a structured hypothesis-testing framework that captures the statistical cost of certifying near-optimality under bounded noise–a cost we show is unavoidable in the multi-distribution setting. Finally, we prove that when competing with the stronger benchmark of each distribution’s optimal Bayes error, the sample complexity incurs a multiplicative penalty in $k$. This establishes a statistical separation between random classification noise and Massart noise, highlighting a fundamental barrier unique to learning from multiple sources.
APA
Hanashiro, R., Shetty, A. & Jaillet, P.. (2026). Is Multi-Distribution Learning as Easy as PAC Learning: Sharp Rates with Bounded Label Noise. Proceedings of Thirty Ninth Conference on Learning Theory, in Proceedings of Machine Learning Research 336:3109-3142 Available from https://proceedings.mlr.press/v336/hanashiro26a.html.

Related Material