Nearly Optimal Sample Complexity for Learning with Label Proportions

Robert Istvan Busa-Fekete, Travis Dick, Claudio Gentile, Haim Kaplan, Tomer Koren, Uri Stemmer
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:5943-5976, 2025.

Abstract

We investigate Learning from Label Proportions (LLP), a partial-information setting in which the examples in a training set are grouped into bags, and only aggregate label values in each bag are available. Despite this partial observability, the goal is still to achieve small regret at the level of individual examples. We give results on the sample complexity of LLP under the square loss, showing that our bounds are essentially optimal. From an algorithmic viewpoint, we rely on carefully designed variants of Empirical Risk Minimization and Stochastic Gradient Descent, combined with ad hoc variance-reduction techniques. On the one hand, our theoretical results improve on the existing LLP literature in important ways, specifically in how the sample complexity depends on the bag size. On the other hand, we validate our algorithmic solutions on several datasets, demonstrating improved empirical performance (better accuracy with fewer samples) against recent baselines.
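To make the setting concrete, below is a minimal, illustrative sketch, not the paper's algorithm: synthetic Gaussian bags with a linear model, where the learner observes only each bag's average label and fits the model by plain SGD on a bag-level square-loss surrogate. The data model, the surrogate, and all names (w_true, y_bar, bag_size) are our assumptions for illustration; the paper's ERM and SGD variants additionally employ variance reduction, which this sketch omits.

    import numpy as np

    rng = np.random.default_rng(0)

    d, n_bags, bag_size = 10, 500, 8           # hypothetical problem sizes
    w_true = rng.normal(size=d)                # ground-truth linear model (hidden)

    # Bags of examples: the learner sees the features X[b] of every bag b,
    # but only the bag's average label y_bar[b], never the per-example labels y[b].
    X = rng.normal(size=(n_bags, bag_size, d))
    y = X @ w_true + 0.1 * rng.normal(size=(n_bags, bag_size))
    y_bar = y.mean(axis=1)

    # Plain SGD on the bag-level square loss (mean prediction vs. mean label);
    # the paper's algorithms refine this idea with variance reduction.
    w, lr = np.zeros(d), 0.05
    for epoch in range(20):
        for b in rng.permutation(n_bags):
            pred_bar = (X[b] @ w).mean()       # average prediction over bag b
            grad = 2.0 * (pred_bar - y_bar[b]) * X[b].mean(axis=0)
            w -= lr * grad

    print("parameter error:", np.linalg.norm(w - w_true))

For this synthetic linear setup the bag-level objective is minimized at w_true, so the parameter error shrinks with training; in general, though, larger bags make the aggregate signal noisier, which is exactly the bag-size dependence the paper's sample complexity bounds quantify.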

Cite this Paper

BibTeX
@InProceedings{pmlr-v267-busa-fekete25a,
  title     = {Nearly Optimal Sample Complexity for Learning with Label Proportions},
  author    = {Busa-Fekete, Robert Istvan and Dick, Travis and Gentile, Claudio and Kaplan, Haim and Koren, Tomer and Stemmer, Uri},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages     = {5943--5976},
  year      = {2025},
  editor    = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume    = {267},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--19 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/busa-fekete25a/busa-fekete25a.pdf},
  url       = {https://proceedings.mlr.press/v267/busa-fekete25a.html},
  abstract  = {We investigate Learning from Label Proportions (LLP), a partial-information setting in which the examples in a training set are grouped into bags, and only aggregate label values in each bag are available. Despite this partial observability, the goal is still to achieve small regret at the level of individual examples. We give results on the sample complexity of LLP under the square loss, showing that our bounds are essentially optimal. From an algorithmic viewpoint, we rely on carefully designed variants of Empirical Risk Minimization and Stochastic Gradient Descent, combined with ad hoc variance-reduction techniques. On the one hand, our theoretical results improve on the existing LLP literature in important ways, specifically in how the sample complexity depends on the bag size. On the other hand, we validate our algorithmic solutions on several datasets, demonstrating improved empirical performance (better accuracy with fewer samples) against recent baselines.}
}
Endnote
%0 Conference Paper
%T Nearly Optimal Sample Complexity for Learning with Label Proportions
%A Robert Istvan Busa-Fekete
%A Travis Dick
%A Claudio Gentile
%A Haim Kaplan
%A Tomer Koren
%A Uri Stemmer
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-busa-fekete25a
%I PMLR
%P 5943--5976
%U https://proceedings.mlr.press/v267/busa-fekete25a.html
%V 267
%X We investigate Learning from Label Proportions (LLP), a partial-information setting in which the examples in a training set are grouped into bags, and only aggregate label values in each bag are available. Despite this partial observability, the goal is still to achieve small regret at the level of individual examples. We give results on the sample complexity of LLP under the square loss, showing that our bounds are essentially optimal. From an algorithmic viewpoint, we rely on carefully designed variants of Empirical Risk Minimization and Stochastic Gradient Descent, combined with ad hoc variance-reduction techniques. On the one hand, our theoretical results improve on the existing LLP literature in important ways, specifically in how the sample complexity depends on the bag size. On the other hand, we validate our algorithmic solutions on several datasets, demonstrating improved empirical performance (better accuracy with fewer samples) against recent baselines.
APA
Busa-Fekete, R.I., Dick, T., Gentile, C., Kaplan, H., Koren, T. & Stemmer, U. (2025). Nearly Optimal Sample Complexity for Learning with Label Proportions. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:5943-5976. Available from https://proceedings.mlr.press/v267/busa-fekete25a.html.
