Evaluating different approaches to calibrating conformal predictive systems

Hugo Werner, Lars Carlsson, Ernst Ahlberg, Henrik Boström.
Proceedings of the Ninth Symposium on Conformal and Probabilistic Prediction and Applications, PMLR 128:134-150, 2020.

Abstract

Conformal predictive systems (CPSs) provide probability distributions for real-valued labels of test examples, rather than point predictions (as output by regular regression models) or confidence intervals (as output by conformal regressors). The performance of a CPS is dependent on both the underlying model and the way in which the quality of its predictions is estimated; a stronger underlying model and a better quality estimation can significantly improve the performance. Recent studies have shown that conformal regressors that use random forests as the underlying model may benefit from using out-of-bag predictions for the calibration, rather than setting aside a separate calibration set, allowing for more data to be used for training and thereby improving the performance of the underlying model. These studies have furthermore shown that the quality of the individual predictions can be effectively estimated using the variance of the predictions or by k-nearest-neighbor models trained on the prediction errors. It is here investigated whether these methods are also effective in the context of split conformal predictive systems. Results from a large empirical study are presented, using 33 publicly available datasets. The results show that by using either variance or the k-nearest-neighbor method for estimating prediction quality, a significant increase in performance, as measured by the continuous ranked probability score, can be obtained compared to omitting the quality estimation. The results furthermore show that the use of out-of-bag examples for calibration is competitive with the most effective way of splitting training data into a proper training set and a calibration set, without requiring tuning of the calibration set size.
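As a rough illustration of the approach described in the abstract, the sketch below (not the authors' implementation) builds a split conformal predictive system on top of a scikit-learn random forest: out-of-bag predictions supply the calibration residuals, the spread of the individual tree predictions serves as the difficulty (quality) estimate used to normalize the scores, and the resulting predictive distributions are scored with the continuous ranked probability score (CRPS). The synthetic data, hyperparameters and helper names (tree_std, crps) are illustrative assumptions, not taken from the paper.

    # A minimal sketch, not the authors' code: a split conformal predictive
    # system with a random forest, calibrated on out-of-bag residuals,
    # normalized by the per-tree prediction spread, and evaluated with CRPS.
    # Synthetic data and all hyperparameters below are illustrative assumptions.
    import numpy as np
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)
    X = rng.normal(size=(1000, 5))
    y = X[:, 0] + 0.5 * X[:, 1] ** 2 + rng.normal(scale=0.3, size=1000)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    # Underlying model: trained on all training data; calibration reuses the
    # out-of-bag (OOB) predictions instead of a held-out calibration set.
    rf = RandomForestRegressor(n_estimators=500, oob_score=True, random_state=0)
    rf.fit(X_train, y_train)

    def tree_std(model, X):
        # Difficulty estimate: standard deviation of the individual tree predictions.
        per_tree = np.stack([tree.predict(X) for tree in model.estimators_])
        return per_tree.std(axis=0) + 1e-8  # small constant avoids division by zero

    # Normalized conformity scores computed from the OOB residuals.
    alpha = (y_train - rf.oob_prediction_) / tree_std(rf, X_train)

    def crps(samples, y_true):
        # CRPS of an empirical distribution: E|Z - y| - 0.5 * E|Z - Z'|.
        term1 = np.mean(np.abs(samples - y_true))
        term2 = 0.5 * np.mean(np.abs(samples[:, None] - samples[None, :]))
        return term1 - term2

    # The (approximate) conformal predictive distribution for a test object is the
    # empirical distribution of y_hat + alpha_i * sigma over the calibration scores.
    y_hat_test = rf.predict(X_test)
    sigma_test = tree_std(rf, X_test)
    scores = [crps(y_hat + alpha * sigma, t)
              for y_hat, sigma, t in zip(y_hat_test, sigma_test, y_test)]
    print("mean CRPS:", np.mean(scores))

The split-calibration alternatives compared in the paper would, under the same assumptions, replace rf.oob_prediction_ with predictions on a separate calibration set of some chosen size; the k-nearest-neighbor difficulty estimate would replace tree_std with a k-NN model fitted on the calibration errors.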

Cite this Paper


BibTeX
@InProceedings{pmlr-v128-werner20a,
  title     = {Evaluating different approaches to calibrating conformal predictive systems},
  author    = {Werner, Hugo and Carlsson, Lars and Ahlberg, Ernst and Bostr\"{o}m, Henrik},
  booktitle = {Proceedings of the Ninth Symposium on Conformal and Probabilistic Prediction and Applications},
  pages     = {134--150},
  year      = {2020},
  editor    = {Gammerman, Alexander and Vovk, Vladimir and Luo, Zhiyuan and Smirnov, Evgueni and Cherubin, Giovanni},
  volume    = {128},
  series    = {Proceedings of Machine Learning Research},
  month     = {09--11 Sep},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v128/werner20a/werner20a.pdf},
  url       = {https://proceedings.mlr.press/v128/werner20a.html},
  abstract  = {Conformal predictive systems (CPSs) provide probability distributions for real-valued labels of test examples, rather than point predictions (as output by regular regression models) or confidence intervals (as output by conformal regressors). The performance of a CPS is dependent on both the underlying model and the way in which the quality of its predictions is estimated; a stronger underlying model and a better quality estimation can significantly improve the performance. Recent studies have shown that conformal regressors that use random forests as the underlying model may benefit from using out-of-bag predictions for the calibration, rather than setting aside a separate calibration set, allowing for more data to be used for training and thereby improving the performance of the underlying model. These studies have furthermore shown that the quality of the individual predictions can be effectively estimated using the variance of the predictions or by k-nearest-neighbor models trained on the prediction errors. It is here investigated whether these methods are also effective in the context of split conformal predictive systems. Results from a large empirical study are presented, using 33 publicly available datasets. The results show that by using either variance or the k-nearest-neighbor method for estimating prediction quality, a significant increase in performance, as measured by the continuous ranked probability score, can be obtained compared to omitting the quality estimation. The results furthermore show that the use of out-of-bag examples for calibration is competitive with the most effective way of splitting training data into a proper training set and a calibration set, without requiring tuning of the calibration set size.}
}
Endnote
%0 Conference Paper
%T Evaluating different approaches to calibrating conformal predictive systems
%A Hugo Werner
%A Lars Carlsson
%A Ernst Ahlberg
%A Henrik Boström
%B Proceedings of the Ninth Symposium on Conformal and Probabilistic Prediction and Applications
%C Proceedings of Machine Learning Research
%D 2020
%E Alexander Gammerman
%E Vladimir Vovk
%E Zhiyuan Luo
%E Evgueni Smirnov
%E Giovanni Cherubin
%F pmlr-v128-werner20a
%I PMLR
%P 134--150
%U https://proceedings.mlr.press/v128/werner20a.html
%V 128
%X Conformal predictive systems (CPSs) provide probability distributions for real-valued labels of test examples, rather than point predictions (as output by regular regression models) or confidence intervals (as output by conformal regressors). The performance of a CPS is dependent on both the underlying model and the way in which the quality of its predictions is estimated; a stronger underlying model and a better quality estimation can significantly improve the performance. Recent studies have shown that conformal regressors that use random forests as the underlying model may benefit from using out-of-bag predictions for the calibration, rather than setting aside a separate calibration set, allowing for more data to be used for training and thereby improving the performance of the underlying model. These studies have furthermore shown that the quality of the individual predictions can be effectively estimated using the variance of the predictions or by k-nearest-neighbor models trained on the prediction errors. It is here investigated whether these methods are also effective in the context of split conformal predictive systems. Results from a large empirical study are presented, using 33 publicly available datasets. The results show that by using either variance or the k-nearest-neighbor method for estimating prediction quality, a significant increase in performance, as measured by the continuous ranked probability score, can be obtained compared to omitting the quality estimation. The results furthermore show that the use of out-of-bag examples for calibration is competitive with the most effective way of splitting training data into a proper training set and a calibration set, without requiring tuning of the calibration set size.
APA
Werner, H., Carlsson, L., Ahlberg, E. & Boström, H. (2020). Evaluating different approaches to calibrating conformal predictive systems. Proceedings of the Ninth Symposium on Conformal and Probabilistic Prediction and Applications, in Proceedings of Machine Learning Research 128:134-150. Available from https://proceedings.mlr.press/v128/werner20a.html.