Calibrating multi-class models

Ulf Johansson, Tuwe Löfström, Henrik Boström
Proceedings of the Tenth Symposium on Conformal and Probabilistic Prediction and Applications, PMLR 152:111-130, 2021.

Abstract

Predictive models communicating algorithmic confidence are very informative, but only if well-calibrated and sharp, i.e., providing accurate probability estimates adjusted for each instance. While almost all machine learning algorithms are able to produce probability estimates, these are often poorly calibrated, thus requiring external calibration. For multi-class problems, external calibration has typically been done using one-vs-all or all-vs-all schemes, thus adding to the computational complexity, but also making it impossible to analyze and inspect the predictive models. In this paper, we suggest a novel approach for calibrating inherently multi-class models. Instead of providing a probability distribution over all labels, the approach estimates the probability that the class label predicted by the underlying model is correct. In an extensive empirical study, it is shown that the suggested approach, when applied to both Platt scaling and Venn-Abers, is able to improve the probability estimates from decision trees, random forests and extreme gradient boosting.
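
The reduction described in the abstract is easy to sketch. The following Python snippet is a minimal, illustrative sketch, not the authors' code: it turns the multi-class calibration problem into a binary one by labeling each calibration instance as "predicted label correct" or not, and applies Platt scaling (a one-dimensional logistic regression) to the underlying model's top class probability. The dataset, model, and split sizes are arbitrary choices for illustration.

import numpy as np
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Proper training / calibration / test split.
X, y = load_digits(return_X_y=True)
X_train, X_rest, y_train, y_rest = train_test_split(X, y, test_size=0.4, random_state=0)
X_cal, X_test, y_cal, y_test = train_test_split(X_rest, y_rest, test_size=0.5, random_state=0)

# Any inherently multi-class model works; a random forest is one of the paper's choices.
model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Binary reduction on the calibration set:
# score  = the model's top class probability,
# target = 1 if the predicted label is correct, else 0.
cal_proba = model.predict_proba(X_cal)
cal_score = cal_proba.max(axis=1)
cal_correct = (model.classes_[cal_proba.argmax(axis=1)] == y_cal).astype(int)

# Platt scaling: a one-dimensional logistic regression on the score.
platt = LogisticRegression().fit(cal_score.reshape(-1, 1), cal_correct)

# Calibrated estimate that each test prediction is correct.
test_score = model.predict_proba(X_test).max(axis=1).reshape(-1, 1)
p_correct = platt.predict_proba(test_score)[:, 1]

A Venn-Abers variant would swap the logistic regression for an isotonic-regression-based Venn predictor over the same (score, correct/incorrect) pairs; the binary reduction itself is unchanged.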

Cite this Paper


BibTeX
@InProceedings{pmlr-v152-johansson21a,
  title     = {Calibrating multi-class models},
  author    = {Johansson, Ulf and L\"{o}fstr\"{o}m, Tuwe and Bostr\"{o}m, Henrik},
  booktitle = {Proceedings of the Tenth Symposium on Conformal and Probabilistic Prediction and Applications},
  pages     = {111--130},
  year      = {2021},
  editor    = {Carlsson, Lars and Luo, Zhiyuan and Cherubin, Giovanni and An Nguyen, Khuong},
  volume    = {152},
  series    = {Proceedings of Machine Learning Research},
  month     = {08--10 Sep},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v152/johansson21a/johansson21a.pdf},
  url       = {https://proceedings.mlr.press/v152/johansson21a.html}
}
Endnote
%0 Conference Paper
%T Calibrating multi-class models
%A Ulf Johansson
%A Tuwe Löfström
%A Henrik Boström
%B Proceedings of the Tenth Symposium on Conformal and Probabilistic Prediction and Applications
%C Proceedings of Machine Learning Research
%D 2021
%E Lars Carlsson
%E Zhiyuan Luo
%E Giovanni Cherubin
%E Khuong An Nguyen
%F pmlr-v152-johansson21a
%I PMLR
%P 111--130
%U https://proceedings.mlr.press/v152/johansson21a.html
%V 152
APA
Johansson, U., Löfström, T. & Boström, H. (2021). Calibrating multi-class models. Proceedings of the Tenth Symposium on Conformal and Probabilistic Prediction and Applications, in Proceedings of Machine Learning Research 152:111-130. Available from https://proceedings.mlr.press/v152/johansson21a.html.