Mix-n-Match : Ensemble and Compositional Methods for Uncertainty Calibration in Deep Learning

Jize Zhang, Bhavya Kailkhura, T. Yong-Jin Han
Proceedings of the 37th International Conference on Machine Learning, PMLR 119:11117-11128, 2020.

Abstract

This paper studies the problem of post-hoc calibration of machine learning classifiers. We introduce the following desiderata for uncertainty calibration: (a) accuracy-preserving, (b) data-efficient, and (c) high expressive power. We show that none of the existing methods satisfy all three requirements, and demonstrate how Mix-n-Match calibration strategies (i.e., ensemble and composition) can help achieve remarkably better data-efficiency and expressive power while provably maintaining the classification accuracy of the original classifier. Mix-n-Match strategies are generic in the sense that they can be used to improve the performance of any off-the-shelf calibrator. We also reveal potential issues in standard evaluation practices. Popular approaches (e.g., histogram-based expected calibration error (ECE)) may provide misleading results especially in small-data regime. Therefore, we propose an alternative data-efficient kernel density-based estimator for a reliable evaluation of the calibration performance and prove its asymptotically unbiasedness and consistency. Our approaches outperform state-of-the-art solutions on both the calibration as well as the evaluation tasks in most of the experimental settings. Our codes are available at https://github.com/zhang64- llnl/Mix-n-Match-Calibration.

Cite this Paper


BibTeX
@InProceedings{pmlr-v119-zhang20k, title = {Mix-n-Match : Ensemble and Compositional Methods for Uncertainty Calibration in Deep Learning}, author = {Zhang, Jize and Kailkhura, Bhavya and Han, T. Yong-Jin}, booktitle = {Proceedings of the 37th International Conference on Machine Learning}, pages = {11117--11128}, year = {2020}, editor = {III, Hal Daumé and Singh, Aarti}, volume = {119}, series = {Proceedings of Machine Learning Research}, month = {13--18 Jul}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v119/zhang20k/zhang20k.pdf}, url = {https://proceedings.mlr.press/v119/zhang20k.html}, abstract = {This paper studies the problem of post-hoc calibration of machine learning classifiers. We introduce the following desiderata for uncertainty calibration: (a) accuracy-preserving, (b) data-efficient, and (c) high expressive power. We show that none of the existing methods satisfy all three requirements, and demonstrate how Mix-n-Match calibration strategies (i.e., ensemble and composition) can help achieve remarkably better data-efficiency and expressive power while provably maintaining the classification accuracy of the original classifier. Mix-n-Match strategies are generic in the sense that they can be used to improve the performance of any off-the-shelf calibrator. We also reveal potential issues in standard evaluation practices. Popular approaches (e.g., histogram-based expected calibration error (ECE)) may provide misleading results especially in small-data regime. Therefore, we propose an alternative data-efficient kernel density-based estimator for a reliable evaluation of the calibration performance and prove its asymptotically unbiasedness and consistency. Our approaches outperform state-of-the-art solutions on both the calibration as well as the evaluation tasks in most of the experimental settings. Our codes are available at https://github.com/zhang64- llnl/Mix-n-Match-Calibration.} }
Endnote
%0 Conference Paper %T Mix-n-Match : Ensemble and Compositional Methods for Uncertainty Calibration in Deep Learning %A Jize Zhang %A Bhavya Kailkhura %A T. Yong-Jin Han %B Proceedings of the 37th International Conference on Machine Learning %C Proceedings of Machine Learning Research %D 2020 %E Hal Daumé III %E Aarti Singh %F pmlr-v119-zhang20k %I PMLR %P 11117--11128 %U https://proceedings.mlr.press/v119/zhang20k.html %V 119 %X This paper studies the problem of post-hoc calibration of machine learning classifiers. We introduce the following desiderata for uncertainty calibration: (a) accuracy-preserving, (b) data-efficient, and (c) high expressive power. We show that none of the existing methods satisfy all three requirements, and demonstrate how Mix-n-Match calibration strategies (i.e., ensemble and composition) can help achieve remarkably better data-efficiency and expressive power while provably maintaining the classification accuracy of the original classifier. Mix-n-Match strategies are generic in the sense that they can be used to improve the performance of any off-the-shelf calibrator. We also reveal potential issues in standard evaluation practices. Popular approaches (e.g., histogram-based expected calibration error (ECE)) may provide misleading results especially in small-data regime. Therefore, we propose an alternative data-efficient kernel density-based estimator for a reliable evaluation of the calibration performance and prove its asymptotically unbiasedness and consistency. Our approaches outperform state-of-the-art solutions on both the calibration as well as the evaluation tasks in most of the experimental settings. Our codes are available at https://github.com/zhang64- llnl/Mix-n-Match-Calibration.
APA
Zhang, J., Kailkhura, B. & Han, T.Y.. (2020). Mix-n-Match : Ensemble and Compositional Methods for Uncertainty Calibration in Deep Learning. Proceedings of the 37th International Conference on Machine Learning, in Proceedings of Machine Learning Research 119:11117-11128 Available from https://proceedings.mlr.press/v119/zhang20k.html.

Related Material