Trainable Calibration Measures for Neural Networks from Kernel Mean Embeddings

Aviral Kumar; Sunita Sarawagi; Ujjwal Jain

Proceedings of the 35th International Conference on Machine Learning, PMLR 80:2805-2814, 2018.

Abstract

Modern neural networks have recently been found to be poorly calibrated, primarily in the direction of over-confidence. Methods like entropy penalty and temperature smoothing improve calibration by clamping confidence, but in doing so compromise the many legitimately confident predictions. We propose a more principled fix that minimizes an explicit calibration error during training. We present MMCE, an RKHS kernel-based measure of calibration that is efficiently trainable alongside the negative likelihood loss without careful hyper-parameter tuning. Theoretically too, MMCE is a sound measure of calibration that is minimized at perfect calibration, and whose finite sample estimates are consistent and enjoy fast convergence rates. Extensive experiments on several network architectures demonstrate that MMCE is a fast, stable, and accurate method to minimize calibration error while maximally preserving the number of high-confidence predictions.
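
To make the idea of a kernel-based calibration measure concrete, here is a minimal sketch of a plug-in estimate of squared MMCE on one batch. It is an illustration under stated assumptions, not the authors' reference implementation: the function names, the Laplacian kernel, and the width of 0.4 are choices made for this sketch, since the abstract does not specify them. Each sample contributes the gap between its correctness (1 or 0) and its stated confidence, and the gaps are combined through a kernel on the confidences.

```python
import math


def laplacian_kernel(r1, r2, width=0.4):
    """Laplacian kernel on confidences; the width is an assumed default."""
    return math.exp(-abs(r1 - r2) / width)


def mmce_squared(confidences, correct, kernel=laplacian_kernel):
    """Plug-in estimate of squared MMCE over one batch (illustrative sketch).

    confidences: predicted probability of the chosen class, each in [0, 1]
    correct:     1 if the prediction was right, else 0
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    m = len(confidences)
    if m == 0:
        raise ValueError("need at least one sample")
    # Per-sample gap between correctness and stated confidence.
    gaps = [c - r for c, r in zip(correct, confidences)]
    total = 0.0
    for i in range(m):
        for j in range(m):
            # Kernel-weighted product of gaps for every pair of samples.
            total += gaps[i] * gaps[j] * kernel(confidences[i], confidences[j])
    return total / (m * m)


# Overconfident batch: always 0.9 confident, but right only half the time.
overconfident = mmce_squared([0.9, 0.9, 0.9, 0.9], [1, 0, 1, 0])
# Same accuracy with confidence 0.5: the gaps cancel within the batch.
matched = mmce_squared([0.5, 0.5, 0.5, 0.5], [1, 0, 1, 0])
```

In this toy example the overconfident batch yields a clearly positive value, while the batch whose confidence matches its accuracy yields a value near zero. For training, the same quantity would be computed in an automatic-differentiation framework and added to the likelihood loss with a weighting coefficient; that coefficient is a hypothetical hyper-parameter here.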

Cite this Paper

BibTeX


@InProceedings{pmlr-v80-kumar18a,
  title = 	 {Trainable Calibration Measures for Neural Networks from Kernel Mean Embeddings},
  author =       {Kumar, Aviral and Sarawagi, Sunita and Jain, Ujjwal},
  booktitle = 	 {Proceedings of the 35th International Conference on Machine Learning},
  pages = 	 {2805--2814},
  year = 	 {2018},
  editor = 	 {Dy, Jennifer and Krause, Andreas},
  volume = 	 {80},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {10--15 Jul},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v80/kumar18a/kumar18a.pdf},
  url = 	 {https://proceedings.mlr.press/v80/kumar18a.html},
  abstract = 	 {Modern neural networks have recently been found to be poorly calibrated, primarily in the direction of over-confidence. Methods like entropy penalty and temperature smoothing improve calibration by clamping confidence, but in doing so compromise the many legitimately confident predictions. We propose a more principled fix that minimizes an explicit calibration error during training. We present MMCE, a RKHS kernel based measure of calibration that is efficiently trainable alongside the negative likelihood loss without careful hyper-parameter tuning. Theoretically too, MMCE is a sound measure of calibration that is minimized at perfect calibration, and whose finite sample estimates are consistent and enjoy fast convergence rates. Extensive experiments on several network architectures demonstrate that MMCE is a fast, stable, and accurate method to minimize calibration error while maximally preserving the number of high confidence predictions.}
}

Endnote

%0 Conference Paper
%T Trainable Calibration Measures for Neural Networks from Kernel Mean Embeddings
%A Aviral Kumar
%A Sunita Sarawagi
%A Ujjwal Jain
%B Proceedings of the 35th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2018
%E Jennifer Dy
%E Andreas Krause
%F pmlr-v80-kumar18a
%I PMLR
%P 2805--2814
%U https://proceedings.mlr.press/v80/kumar18a.html
%V 80
%X Modern neural networks have recently been found to be poorly calibrated, primarily in the direction of over-confidence. Methods like entropy penalty and temperature smoothing improve calibration by clamping confidence, but in doing so compromise the many legitimately confident predictions. We propose a more principled fix that minimizes an explicit calibration error during training. We present MMCE, a RKHS kernel based measure of calibration that is efficiently trainable alongside the negative likelihood loss without careful hyper-parameter tuning. Theoretically too, MMCE is a sound measure of calibration that is minimized at perfect calibration, and whose finite sample estimates are consistent and enjoy fast convergence rates. Extensive experiments on several network architectures demonstrate that MMCE is a fast, stable, and accurate method to minimize calibration error while maximally preserving the number of high confidence predictions.

APA


Kumar, A., Sarawagi, S. & Jain, U. (2018). Trainable Calibration Measures for Neural Networks from Kernel Mean Embeddings. Proceedings of the 35th International Conference on Machine Learning, in Proceedings of Machine Learning Research 80:2805-2814. Available from https://proceedings.mlr.press/v80/kumar18a.html.
