PAC-Bayes Analysis for Recalibration in Classification

Masahiro Fujisawa, Futoshi Futami
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:17986-18023, 2025.

Abstract

Nonparametric estimation using uniform-width binning is a standard approach for evaluating the calibration performance of machine learning models. However, existing theoretical analyses of the bias induced by binning are limited to binary classification, creating a significant gap with practical applications such as multiclass classification. Additionally, many parametric recalibration algorithms lack theoretical guarantees for their generalization performance. To address these issues, we conduct a generalization analysis of calibration error using the probably approximately correct Bayes framework. This approach enables us to derive the first optimizable upper bound for generalization error in the calibration context. On the basis of our theory, we propose a generalization-aware recalibration algorithm. Numerical experiments show that our algorithm enhances the performance of Gaussian process-based recalibration across various benchmark datasets and models.
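As a rough illustration of the uniform-width binning estimator the abstract refers to, the sketch below computes a binned expected calibration error from top-label confidences. It is not the estimator or algorithm from the paper: the function name `binned_ece`, the default of 10 bins, and the choice of the top-label variant for the multiclass case are all assumptions made for this example.

```python
import numpy as np


def binned_ece(confidences, correct, n_bins=10):
    """Binned calibration error with uniform-width bins on [0, 1].

    Hypothetical helper for illustration only.
    confidences: top-label predicted probabilities, one per sample.
    correct:     1 if the top-label prediction was right, else 0.
    """
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    n = confidences.shape[0]

    # Uniform-width bin edges: 0, 1/n_bins, ..., 1.
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Using only the interior edges maps each value to a bin in 0..n_bins-1,
    # so a confidence of exactly 1.0 lands in the last bin.
    bin_idx = np.digitize(confidences, edges[1:-1])

    ece = 0.0
    for b in range(n_bins):
        mask = bin_idx == b
        if not mask.any():
            continue  # empty bins contribute nothing
        # Gap between empirical accuracy and mean confidence in this bin.
        gap = abs(correct[mask].mean() - confidences[mask].mean())
        # Weight the gap by the fraction of samples in the bin.
        ece += (mask.sum() / n) * gap
    return float(ece)


if __name__ == "__main__":
    conf = [0.95, 0.95, 0.55, 0.55]
    hit = [1, 1, 1, 0]
    print(binned_ece(conf, hit))
```

The estimate depends on the number of bins, which is the source of the binning bias that the abstract says has so far been analyzed only for binary classification.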

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-fujisawa25a,
  title     = {{PAC}-{B}ayes Analysis for Recalibration in Classification},
  author    = {Fujisawa, Masahiro and Futami, Futoshi},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages     = {17986--18023},
  year      = {2025},
  editor    = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume    = {267},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--19 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/fujisawa25a/fujisawa25a.pdf},
  url       = {https://proceedings.mlr.press/v267/fujisawa25a.html},
  abstract  = {Nonparametric estimation using uniform-width binning is a standard approach for evaluating the calibration performance of machine learning models. However, existing theoretical analyses of the bias induced by binning are limited to binary classification, creating a significant gap with practical applications such as multiclass classification. Additionally, many parametric recalibration algorithms lack theoretical guarantees for their generalization performance. To address these issues, we conduct a generalization analysis of calibration error using the probably approximately correct Bayes framework. This approach enables us to derive the first optimizable upper bound for generalization error in the calibration context. On the basis of our theory, we propose a generalization-aware recalibration algorithm. Numerical experiments show that our algorithm enhances the performance of Gaussian process-based recalibration across various benchmark datasets and models.}
}
Endnote
%0 Conference Paper
%T PAC-Bayes Analysis for Recalibration in Classification
%A Masahiro Fujisawa
%A Futoshi Futami
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-fujisawa25a
%I PMLR
%P 17986--18023
%U https://proceedings.mlr.press/v267/fujisawa25a.html
%V 267
%X Nonparametric estimation using uniform-width binning is a standard approach for evaluating the calibration performance of machine learning models. However, existing theoretical analyses of the bias induced by binning are limited to binary classification, creating a significant gap with practical applications such as multiclass classification. Additionally, many parametric recalibration algorithms lack theoretical guarantees for their generalization performance. To address these issues, we conduct a generalization analysis of calibration error using the probably approximately correct Bayes framework. This approach enables us to derive the first optimizable upper bound for generalization error in the calibration context. On the basis of our theory, we propose a generalization-aware recalibration algorithm. Numerical experiments show that our algorithm enhances the performance of Gaussian process-based recalibration across various benchmark datasets and models.
APA
Fujisawa, M. & Futami, F. (2025). PAC-Bayes Analysis for Recalibration in Classification. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:17986-18023. Available from https://proceedings.mlr.press/v267/fujisawa25a.html.
