A Large-Scale Study of Probabilistic Calibration in Neural Network Regression

Victor Dheur, Souhaib Ben Taieb
Proceedings of the 40th International Conference on Machine Learning, PMLR 202:7813-7836, 2023.

Abstract

Accurate probabilistic predictions are essential for optimal decision making. While neural network miscalibration has been studied primarily in classification, we investigate this in the less-explored domain of regression. We conduct the largest empirical study to date to assess the probabilistic calibration of neural networks. We also analyze the performance of recalibration, conformal, and regularization methods to enhance probabilistic calibration. Additionally, we introduce novel differentiable recalibration and regularization methods, uncovering new insights into their effectiveness. Our findings reveal that regularization methods offer a favorable tradeoff between calibration and sharpness. Post-hoc methods exhibit superior probabilistic calibration, which we attribute to the finite-sample coverage guarantee of conformal prediction. Furthermore, we demonstrate that quantile recalibration can be considered as a specific case of conformal prediction. Our study is fully reproducible and implemented in a common code base for fair comparisons.

Cite this Paper

BibTeX
@InProceedings{pmlr-v202-dheur23a,
  title = {A Large-Scale Study of Probabilistic Calibration in Neural Network Regression},
  author = {Dheur, Victor and Ben Taieb, Souhaib},
  booktitle = {Proceedings of the 40th International Conference on Machine Learning},
  pages = {7813--7836},
  year = {2023},
  editor = {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan},
  volume = {202},
  series = {Proceedings of Machine Learning Research},
  month = {23--29 Jul},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v202/dheur23a/dheur23a.pdf},
  url = {https://proceedings.mlr.press/v202/dheur23a.html},
  abstract = {Accurate probabilistic predictions are essential for optimal decision making. While neural network miscalibration has been studied primarily in classification, we investigate this in the less-explored domain of regression. We conduct the largest empirical study to date to assess the probabilistic calibration of neural networks. We also analyze the performance of recalibration, conformal, and regularization methods to enhance probabilistic calibration. Additionally, we introduce novel differentiable recalibration and regularization methods, uncovering new insights into their effectiveness. Our findings reveal that regularization methods offer a favorable tradeoff between calibration and sharpness. Post-hoc methods exhibit superior probabilistic calibration, which we attribute to the finite-sample coverage guarantee of conformal prediction. Furthermore, we demonstrate that quantile recalibration can be considered as a specific case of conformal prediction. Our study is fully reproducible and implemented in a common code base for fair comparisons.}
}
Endnote
%0 Conference Paper
%T A Large-Scale Study of Probabilistic Calibration in Neural Network Regression
%A Victor Dheur
%A Souhaib Ben Taieb
%B Proceedings of the 40th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Andreas Krause
%E Emma Brunskill
%E Kyunghyun Cho
%E Barbara Engelhardt
%E Sivan Sabato
%E Jonathan Scarlett
%F pmlr-v202-dheur23a
%I PMLR
%P 7813--7836
%U https://proceedings.mlr.press/v202/dheur23a.html
%V 202
%X Accurate probabilistic predictions are essential for optimal decision making. While neural network miscalibration has been studied primarily in classification, we investigate this in the less-explored domain of regression. We conduct the largest empirical study to date to assess the probabilistic calibration of neural networks. We also analyze the performance of recalibration, conformal, and regularization methods to enhance probabilistic calibration. Additionally, we introduce novel differentiable recalibration and regularization methods, uncovering new insights into their effectiveness. Our findings reveal that regularization methods offer a favorable tradeoff between calibration and sharpness. Post-hoc methods exhibit superior probabilistic calibration, which we attribute to the finite-sample coverage guarantee of conformal prediction. Furthermore, we demonstrate that quantile recalibration can be considered as a specific case of conformal prediction. Our study is fully reproducible and implemented in a common code base for fair comparisons.
APA
Dheur, V. & Ben Taieb, S. (2023). A Large-Scale Study of Probabilistic Calibration in Neural Network Regression. Proceedings of the 40th International Conference on Machine Learning, in Proceedings of Machine Learning Research 202:7813-7836. Available from https://proceedings.mlr.press/v202/dheur23a.html.