Consistent and Asymptotically Unbiased Estimation of Proper Calibration Errors

Teodora Popordanoska, Sebastian Gregor Gruber, Aleksei Tiulpin, Florian Buettner, Matthew B. Blaschko
Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, PMLR 238:3466-3474, 2024.

Abstract

Proper scoring rules evaluate the quality of probabilistic predictions, playing an essential role in the pursuit of accurate and well-calibrated models. Every proper score decomposes into two fundamental components – proper calibration error and refinement – utilizing a Bregman divergence. While uncertainty calibration has gained significant attention, the current literature lacks a general estimator for these quantities with known statistical properties. To address this gap, we propose a method that allows consistent and asymptotically unbiased estimation of all proper calibration errors and refinement terms. In particular, we introduce the Kullback-Leibler calibration error, induced by the commonly used cross-entropy loss. As part of our results, we prove a relation between refinement and f-divergences, which implies information monotonicity in neural networks, regardless of which proper scoring rule is optimized. Our experiments empirically validate the claimed properties of the proposed estimator and suggest that the selection of a post-hoc calibration method should be determined by the particular calibration error of interest.
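The calibration/refinement decomposition mentioned above can be illustrated with the classical binned decomposition of the Brier (squared error) score in the binary case. The sketch below is not the paper's proposed estimator; it is a minimal illustration of the decomposition using a hypothetical helper `binned_brier_decomposition`, and the identity is exact only when predictions are constant within each bin.

```python
import numpy as np

def binned_brier_decomposition(p, y, n_bins=10):
    """Binned decomposition of the Brier score into a calibration
    term and a refinement term (binary classification).

    p : predicted probabilities of the positive class, shape (n,)
    y : binary labels in {0, 1}, shape (n,)
    """
    bins = np.clip((p * n_bins).astype(int), 0, n_bins - 1)
    calibration = 0.0
    refinement = 0.0
    for b in range(n_bins):
        mask = bins == b
        if not mask.any():
            continue
        w = mask.mean()            # bin weight n_b / n
        p_bar = p[mask].mean()     # mean confidence in the bin
        y_bar = y[mask].mean()     # empirical frequency in the bin
        calibration += w * (p_bar - y_bar) ** 2   # squared gap to calibration
        refinement += w * y_bar * (1.0 - y_bar)   # irreducible within-bin variance
    return calibration, refinement
```

When the predictions are discretized to their bin means, `calibration + refinement` equals the Brier score exactly; for continuous predictions the binning introduces the bias that kernel-based estimators like the one in this paper aim to avoid.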

Cite this Paper


BibTeX
@InProceedings{pmlr-v238-popordanoska24a,
  title     = {Consistent and Asymptotically Unbiased Estimation of Proper Calibration Errors},
  author    = {Popordanoska, Teodora and Gruber, Sebastian Gregor and Tiulpin, Aleksei and Buettner, Florian and Blaschko, Matthew B.},
  booktitle = {Proceedings of The 27th International Conference on Artificial Intelligence and Statistics},
  pages     = {3466--3474},
  year      = {2024},
  editor    = {Dasgupta, Sanjoy and Mandt, Stephan and Li, Yingzhen},
  volume    = {238},
  series    = {Proceedings of Machine Learning Research},
  month     = {02--04 May},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v238/popordanoska24a/popordanoska24a.pdf},
  url       = {https://proceedings.mlr.press/v238/popordanoska24a.html},
  abstract  = {Proper scoring rules evaluate the quality of probabilistic predictions, playing an essential role in the pursuit of accurate and well-calibrated models. Every proper score decomposes into two fundamental components – proper calibration error and refinement – utilizing a Bregman divergence. While uncertainty calibration has gained significant attention, the current literature lacks a general estimator for these quantities with known statistical properties. To address this gap, we propose a method that allows consistent and asymptotically unbiased estimation of all proper calibration errors and refinement terms. In particular, we introduce the Kullback-Leibler calibration error, induced by the commonly used cross-entropy loss. As part of our results, we prove a relation between refinement and f-divergences, which implies information monotonicity in neural networks, regardless of which proper scoring rule is optimized. Our experiments empirically validate the claimed properties of the proposed estimator and suggest that the selection of a post-hoc calibration method should be determined by the particular calibration error of interest.}
}
Endnote
%0 Conference Paper %T Consistent and Asymptotically Unbiased Estimation of Proper Calibration Errors %A Teodora Popordanoska %A Sebastian Gregor Gruber %A Aleksei Tiulpin %A Florian Buettner %A Matthew B. Blaschko %B Proceedings of The 27th International Conference on Artificial Intelligence and Statistics %C Proceedings of Machine Learning Research %D 2024 %E Sanjoy Dasgupta %E Stephan Mandt %E Yingzhen Li %F pmlr-v238-popordanoska24a %I PMLR %P 3466--3474 %U https://proceedings.mlr.press/v238/popordanoska24a.html %V 238 %X Proper scoring rules evaluate the quality of probabilistic predictions, playing an essential role in the pursuit of accurate and well-calibrated models. Every proper score decomposes into two fundamental components – proper calibration error and refinement – utilizing a Bregman divergence. While uncertainty calibration has gained significant attention, the current literature lacks a general estimator for these quantities with known statistical properties. To address this gap, we propose a method that allows consistent and asymptotically unbiased estimation of all proper calibration errors and refinement terms. In particular, we introduce the Kullback-Leibler calibration error, induced by the commonly used cross-entropy loss. As part of our results, we prove a relation between refinement and f-divergences, which implies information monotonicity in neural networks, regardless of which proper scoring rule is optimized. Our experiments empirically validate the claimed properties of the proposed estimator and suggest that the selection of a post-hoc calibration method should be determined by the particular calibration error of interest.
APA
Popordanoska, T., Gruber, S. G., Tiulpin, A., Buettner, F. & Blaschko, M. B. (2024). Consistent and Asymptotically Unbiased Estimation of Proper Calibration Errors. Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 238:3466-3474. Available from https://proceedings.mlr.press/v238/popordanoska24a.html.