$U$-ensembles: Improved diversity in the small data regime using unlabeled data
Proceedings of the 7th Symposium on Advances in Approximate Bayesian Inference, PMLR 289:131-167, 2025.
Abstract
We present a method to improve the calibration of deep ensembles in the small data regime in the presence of unlabeled data. Our approach, which we name $U$-ensembles, is extremely easy to implement: given an unlabeled set, for each unlabeled data point, we simply fit a different randomly selected label with each ensemble member. We provide a theoretical analysis based on a PAC-Bayes bound which guarantees that for such a labeling we obtain low negative log-likelihood and high ensemble diversity on test samples. Empirically, through detailed experiments, we find that for small to moderately sized training sets, $U$-ensembles are more diverse and provide better calibration than standard ensembles.
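The labeling step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the sampling scheme, so this sketch assumes that "a different randomly selected label" means labels drawn without replacement across ensemble members, which requires at least as many classes as members. The function name `assign_random_labels` and its parameters are hypothetical and not taken from the paper.

```python
import random


def assign_random_labels(num_unlabeled, num_members, num_classes, seed=0):
    """Return labels[m][j]: the label member m is given for unlabeled point j.

    Assumption (not from the paper): for each point, labels are drawn
    without replacement, so no two members share a label on that point.
    """
    if num_members > num_classes:
        raise ValueError("need at least as many classes as ensemble members")
    rng = random.Random(seed)
    labels = [[0] * num_unlabeled for _ in range(num_members)]
    for j in range(num_unlabeled):
        # Draw num_members distinct classes for point j, one per member.
        picks = rng.sample(range(num_classes), num_members)
        for m, label in enumerate(picks):
            labels[m][j] = label
    return labels


# Example: 3 ensemble members, 5 unlabeled points, 10 classes.
labels = assign_random_labels(num_unlabeled=5, num_members=3, num_classes=10)
```

Under this reading, each ensemble member would then be trained on the labeled set together with the unlabeled points paired with its own row of `labels`; how the two losses are combined is not stated in the abstract and is left out here.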