$U$-ensembles: Improved diversity in the small data regime using unlabeled data

Konstantinos Pitas, Hani Anouar Bourrous, Julyan Arbel
Proceedings of the 7th Symposium on Advances in Approximate Bayesian Inference, PMLR 289:131-167, 2025.

Abstract

We present a method to improve the calibration of deep ensembles in the small data regime in the presence of unlabeled data. Our approach, which we name $U$-ensembles, is extremely easy to implement: given an unlabeled set, for each unlabeled data point, we simply fit a different randomly selected label with each ensemble member. We provide a theoretical analysis based on a PAC-Bayes bound which guarantees that for such a labeling we obtain low negative log-likelihood and high ensemble diversity on testing samples. Empirically, through detailed experiments, we find that for low to moderately-sized training sets, $U$-ensembles are more diverse and provide better calibration than standard ensembles.
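The labeling scheme described in the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name `assign_u_labels` and the assumption that labels are drawn without replacement across ensemble members (so each member gets a distinct label per point, which requires at least as many classes as members) are my own reading of "a different randomly selected label with each ensemble member".

```python
import random

def assign_u_labels(num_unlabeled, num_members, num_classes, seed=0):
    """For each unlabeled point, draw a distinct random label for each
    ensemble member (assumes num_classes >= num_members)."""
    rng = random.Random(seed)
    assignments = []  # assignments[i][m] = label member m fits on point i
    for _ in range(num_unlabeled):
        assignments.append(rng.sample(range(num_classes), num_members))
    return assignments

# Member m would then be trained on the labeled set plus the unlabeled
# points with targets [assignments[i][m] for i in range(num_unlabeled)].
labels = assign_u_labels(num_unlabeled=5, num_members=3, num_classes=10)
```

Fitting these deliberately conflicting labels is what pushes the ensemble members apart on unlabeled data, which is the source of the extra diversity the paper analyzes.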

Cite this Paper


BibTeX
@InProceedings{pmlr-v289-pitas25a,
  title     = {$U$-ensembles: Improved diversity in the small data regime using unlabeled data},
  author    = {Pitas, Konstantinos and Bourrous, Hani Anouar and Arbel, Julyan},
  booktitle = {Proceedings of the 7th Symposium on Advances in Approximate Bayesian Inference},
  pages     = {131--167},
  year      = {2025},
  editor    = {Allingham, James Urquhart and Swaroop, Siddharth},
  volume    = {289},
  series    = {Proceedings of Machine Learning Research},
  month     = {29 Apr},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v289/main/assets/pitas25a/pitas25a.pdf},
  url       = {https://proceedings.mlr.press/v289/pitas25a.html},
  abstract  = {We present a method to improve the calibration of deep ensembles in the small data regime in the presence of unlabeled data. Our approach, which we name $U$-ensembles, is extremely easy to implement: given an unlabeled set, for each unlabeled data point, we simply fit a different randomly selected label with each ensemble member. We provide a theoretical analysis based on a PAC-Bayes bound which guarantees that for such a labeling we obtain low negative log-likelihood and high ensemble diversity on testing samples. Empirically, through detailed experiments, we find that for low to moderately-sized training sets, $U$-ensembles are more diverse and provide better calibration than standard ensembles.}
}
Endnote
%0 Conference Paper
%T $U$-ensembles: Improved diversity in the small data regime using unlabeled data
%A Konstantinos Pitas
%A Hani Anouar Bourrous
%A Julyan Arbel
%B Proceedings of the 7th Symposium on Advances in Approximate Bayesian Inference
%C Proceedings of Machine Learning Research
%D 2025
%E James Urquhart Allingham
%E Siddharth Swaroop
%F pmlr-v289-pitas25a
%I PMLR
%P 131--167
%U https://proceedings.mlr.press/v289/pitas25a.html
%V 289
%X We present a method to improve the calibration of deep ensembles in the small data regime in the presence of unlabeled data. Our approach, which we name $U$-ensembles, is extremely easy to implement: given an unlabeled set, for each unlabeled data point, we simply fit a different randomly selected label with each ensemble member. We provide a theoretical analysis based on a PAC-Bayes bound which guarantees that for such a labeling we obtain low negative log-likelihood and high ensemble diversity on testing samples. Empirically, through detailed experiments, we find that for low to moderately-sized training sets, $U$-ensembles are more diverse and provide better calibration than standard ensembles.
APA
Pitas, K., Bourrous, H. A., &amp; Arbel, J. (2025). $U$-ensembles: Improved diversity in the small data regime using unlabeled data. Proceedings of the 7th Symposium on Advances in Approximate Bayesian Inference, in Proceedings of Machine Learning Research 289:131-167. Available from https://proceedings.mlr.press/v289/pitas25a.html.