List Sample Compression and Uniform Convergence

Steve Hanneke; Shay Moran; Waknine Tom

Proceedings of Thirty Seventh Conference on Learning Theory, PMLR 247:2360-2388, 2024.

Abstract

List learning is a variant of supervised classification where the learner outputs multiple plausible labels for each instance rather than just one. We investigate classical principles related to generalization within the context of list learning. Our primary goal is to determine whether classical principles in the PAC setting retain their applicability in the domain of list PAC learning. We focus on uniform convergence (which is the basis of Empirical Risk Minimization) and on sample compression (which is a powerful manifestation of Occam’s Razor). In classical PAC learning, both uniform convergence and sample compression satisfy a form of ‘completeness’: whenever a class is learnable, it can also be learned by a learning rule that adheres to these principles. We ask whether the same completeness holds true in the list learning setting. We show that uniform convergence remains equivalent to learnability in the list PAC learning setting. In contrast, our findings reveal surprising results regarding sample compression: we prove that when the label space is $Y=\{0,1,2\}$, there are $2$-list-learnable classes that cannot be compressed. This refutes the list version of the sample compression conjecture by Littlestone and Warmuth in 1986. We prove an even stronger impossibility result, showing that there are $2$-list-learnable classes that cannot be compressed even when the reconstructed function can work with lists of arbitrarily large size. We prove a similar result for (1-list) PAC learnable classes when the label space is unbounded.

Cite this Paper

BibTeX

@InProceedings{pmlr-v247-hanneke24b,
  title = 	 {List Sample Compression and Uniform Convergence},
  author =       {Hanneke, Steve and Moran, Shay and Tom, Waknine},
  booktitle = 	 {Proceedings of Thirty Seventh Conference on Learning Theory},
  pages = 	 {2360--2388},
  year = 	 {2024},
  editor = 	 {Agrawal, Shipra and Roth, Aaron},
  volume = 	 {247},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {30 Jun--03 Jul},
  publisher =    {PMLR},
  pdf = 	 {https://proceedings.mlr.press/v247/hanneke24b/hanneke24b.pdf},
  url = 	 {https://proceedings.mlr.press/v247/hanneke24b.html},
  abstract = 	 {List learning is a variant of supervised classification where the learner outputs multiple plausible labels for each instance rather than just one.  We investigate classical principles related to generalization within the context of list learning. Our primary goal is to determine whether classical principles in the PAC setting retain their applicability in the domain of list PAC learning. We focus on uniform convergence (which is the basis of Empirical Risk Minimization) and on sample compression (which is a powerful manifestation of Occam’s Razor). In classical PAC learning, both uniform convergence and sample compression satisfy a form of ‘completeness’: whenever a class is learnable, it can also be learned by a learning rule that adheres to these principles. We ask whether the same completeness holds true in the list learning setting. We show that uniform convergence remains equivalent to learnability in the list PAC learning setting. In contrast, our findings reveal surprising results regarding sample compression: we prove that when the label space is $Y=\{0,1,2\}$, then there are 2-list-learnable classes that cannot be compressed. This refutes the list version of the sample compression conjecture by Littlestone and Warmuth in 1986. We prove an even stronger impossibility result, showing that there are $2$-list-learnable classes that cannot be compressed even when the reconstructed function can work with lists of arbitrarily large size. We prove a similar result for (1-list) PAC learnable classes when the label space is unbounded.}
}

Endnote

%0 Conference Paper
%T List Sample Compression and Uniform Convergence
%A Steve Hanneke
%A Shay Moran
%A Waknine Tom
%B Proceedings of Thirty Seventh Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2024
%E Shipra Agrawal
%E Aaron Roth
%F pmlr-v247-hanneke24b
%I PMLR
%P 2360--2388
%U https://proceedings.mlr.press/v247/hanneke24b.html
%V 247
%X List learning is a variant of supervised classification where the learner outputs multiple plausible labels for each instance rather than just one.  We investigate classical principles related to generalization within the context of list learning. Our primary goal is to determine whether classical principles in the PAC setting retain their applicability in the domain of list PAC learning. We focus on uniform convergence (which is the basis of Empirical Risk Minimization) and on sample compression (which is a powerful manifestation of Occam’s Razor). In classical PAC learning, both uniform convergence and sample compression satisfy a form of ‘completeness’: whenever a class is learnable, it can also be learned by a learning rule that adheres to these principles. We ask whether the same completeness holds true in the list learning setting. We show that uniform convergence remains equivalent to learnability in the list PAC learning setting. In contrast, our findings reveal surprising results regarding sample compression: we prove that when the label space is $Y=\{0,1,2\}$, then there are 2-list-learnable classes that cannot be compressed. This refutes the list version of the sample compression conjecture by Littlestone and Warmuth in 1986. We prove an even stronger impossibility result, showing that there are $2$-list-learnable classes that cannot be compressed even when the reconstructed function can work with lists of arbitrarily large size. We prove a similar result for (1-list) PAC learnable classes when the label space is unbounded.

APA

Hanneke, S., Moran, S. & Tom, W. (2024). List Sample Compression and Uniform Convergence. Proceedings of Thirty Seventh Conference on Learning Theory, in Proceedings of Machine Learning Research 247:2360-2388. Available from https://proceedings.mlr.press/v247/hanneke24b.html.
