Two Sides of Miscalibration: Identifying Over and Under-Confidence Prediction for Network Calibration

Shuang Ao; Stefan Rueger; Advaith Siddharthan

Proceedings of the Thirty-Ninth Conference on Uncertainty in Artificial Intelligence, PMLR 216:77-87, 2023.

Abstract

Proper confidence calibration of deep neural networks is essential for reliable predictions in safety-critical tasks. Miscalibration can lead to model over-confidence and/or under-confidence; i.e., the model’s confidence in its prediction can be greater or less than the model’s accuracy. Recent studies have highlighted the over-confidence issue by introducing calibration techniques and demonstrated success on various tasks. However, miscalibration through under-confidence has yet to receive much attention. In this paper, we address the necessity of paying attention to the under-confidence issue. We first introduce a novel metric, a miscalibration score, to identify the overall and class-wise calibration status, including being over- or under-confident. Our proposed metric reveals the pitfalls of existing calibration techniques, where they often overly calibrate the model and worsen under-confident predictions. Then we utilize the class-wise miscalibration score as a proxy to design a calibration technique that can tackle both over- and under-confidence. We report extensive experiments that show that our proposed methods substantially outperform existing calibration techniques. We also validate our proposed calibration technique on an automatic failure detection task with a risk-coverage curve, reporting that our methods improve failure detection as well as trustworthiness of the model. The code is available at https://github.com/AoShuang92/miscalibration_TS.
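
The abstract's notion of confidence being greater or less than accuracy can be illustrated with a signed gap. The sketch below is only an illustration under stated assumptions: it is not the paper's miscalibration score, whose definition is not given in this abstract. The function names `signed_calibration_gap` and `classwise_signed_gap` are hypothetical, and the gap is assumed to be mean top-1 confidence minus accuracy, grouped by predicted class for the class-wise variant.

```python
import numpy as np


def _top1(probs, labels):
    """Return predicted class, top-1 confidence, and correctness per sample."""
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)
    preds = probs.argmax(axis=1)
    conf = probs.max(axis=1)
    correct = (preds == labels).astype(float)
    return preds, conf, correct


def signed_calibration_gap(probs, labels):
    """Mean confidence minus accuracy (assumed definition, for illustration).

    A positive value suggests over-confidence, a negative value under-confidence.
    """
    _, conf, correct = _top1(probs, labels)
    return float(conf.mean() - correct.mean())


def classwise_signed_gap(probs, labels):
    """Signed gap per predicted class; classes that are never predicted are omitted."""
    preds, conf, correct = _top1(probs, labels)
    gaps = {}
    for c in np.unique(preds):
        mask = preds == c
        gaps[int(c)] = float(conf[mask].mean() - correct[mask].mean())
    return gaps


# Toy data (made up for this example): four samples, two classes.
probs = [[0.9, 0.1], [0.8, 0.2], [0.3, 0.7], [0.4, 0.6]]
labels = [0, 1, 1, 1]

overall = signed_calibration_gap(probs, labels)
per_class = classwise_signed_gap(probs, labels)
print(overall, per_class)
```

On this toy data the overall gap is roughly zero, while class 0 comes out over-confident (about +0.35) and class 1 under-confident (about -0.35). This is one way an aggregate view can hide both sides of miscalibration.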

Cite this Paper

BibTeX


@InProceedings{pmlr-v216-ao23a,
  title = 	 {Two Sides of Miscalibration: Identifying Over and Under-Confidence Prediction for Network Calibration},
  author =       {Ao, Shuang and Rueger, Stefan and Siddharthan, Advaith},
  booktitle = 	 {Proceedings of the Thirty-Ninth Conference on Uncertainty in Artificial Intelligence},
  pages = 	 {77--87},
  year = 	 {2023},
  editor = 	 {Evans, Robin J. and Shpitser, Ilya},
  volume = 	 {216},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {31 Jul--04 Aug},
  publisher =    {PMLR},
  pdf = 	 {https://proceedings.mlr.press/v216/ao23a/ao23a.pdf},
  url = 	 {https://proceedings.mlr.press/v216/ao23a.html},
  abstract = 	 {Proper confidence calibration of deep neural networks is essential for reliable predictions in safety-critical tasks. Miscalibration can lead to model over-confidence and/or under-confidence; i.e., the model’s confidence in its prediction can be greater or less than the model’s accuracy. Recent studies have highlighted the over-confidence issue by introducing calibration techniques and demonstrated success on various tasks. However, miscalibration through under-confidence has yet to receive much attention. In this paper, we address the necessity of paying attention to the under-confidence issue. We first introduce a novel metric, a miscalibration score, to identify the overall and class-wise calibration status, including being over- or under-confident. Our proposed metric reveals the pitfalls of existing calibration techniques, where they often overly calibrate the model and worsen under-confident predictions. Then we utilize the class-wise miscalibration score as a proxy to design a calibration technique that can tackle both over- and under-confidence. We report extensive experiments that show that our proposed methods substantially outperform existing calibration techniques. We also validate our proposed calibration technique on an automatic failure detection task with a risk-coverage curve, reporting that our methods improve failure detection as well as trustworthiness of the model. The code is available at \url{https://github.com/AoShuang92/miscalibration_TS}.}
}

Endnote

%0 Conference Paper
%T Two Sides of Miscalibration: Identifying Over and Under-Confidence Prediction for Network Calibration
%A Shuang Ao
%A Stefan Rueger
%A Advaith Siddharthan
%B Proceedings of the Thirty-Ninth Conference on Uncertainty in Artificial Intelligence
%C Proceedings of Machine Learning Research
%D 2023
%E Robin J. Evans
%E Ilya Shpitser
%F pmlr-v216-ao23a
%I PMLR
%P 77--87
%U https://proceedings.mlr.press/v216/ao23a.html
%V 216
%X Proper confidence calibration of deep neural networks is essential for reliable predictions in safety-critical tasks. Miscalibration can lead to model over-confidence and/or under-confidence; i.e., the model’s confidence in its prediction can be greater or less than the model’s accuracy. Recent studies have highlighted the over-confidence issue by introducing calibration techniques and demonstrated success on various tasks. However, miscalibration through under-confidence has yet to receive much attention. In this paper, we address the necessity of paying attention to the under-confidence issue. We first introduce a novel metric, a miscalibration score, to identify the overall and class-wise calibration status, including being over- or under-confident. Our proposed metric reveals the pitfalls of existing calibration techniques, where they often overly calibrate the model and worsen under-confident predictions. Then we utilize the class-wise miscalibration score as a proxy to design a calibration technique that can tackle both over- and under-confidence. We report extensive experiments that show that our proposed methods substantially outperform existing calibration techniques. We also validate our proposed calibration technique on an automatic failure detection task with a risk-coverage curve, reporting that our methods improve failure detection as well as trustworthiness of the model. The code is available at https://github.com/AoShuang92/miscalibration_TS.

APA


Ao, S., Rueger, S. & Siddharthan, A. (2023). Two Sides of Miscalibration: Identifying Over and Under-Confidence Prediction for Network Calibration. Proceedings of the Thirty-Ninth Conference on Uncertainty in Artificial Intelligence, in Proceedings of Machine Learning Research 216:77-87. Available from https://proceedings.mlr.press/v216/ao23a.html.
