Pointwise sampling uncertainties on the Precision-Recall curve

Ralph E.Q. Urlus; Max Baak; Stéphane Collot; Ilan Fridman Rojas

Proceedings of The 26th International Conference on Artificial Intelligence and Statistics, PMLR 206:8211-8232, 2023.

Abstract

Quoting robust uncertainties on machine learning (ML) model metrics, such as f1-score, precision, recall, etc., from sources of uncertainty such as data sampling, parameter initialization, and target labelling, is typically not done in the field of data science, even though these are essential for the proper interpretation and comparison of ML models. This text shows how to calculate and visualize the impact of one dominant source of uncertainty, data sampling, on each point of the Precision-Recall (PR) and Receiver Operating Characteristic (ROC) curves. This is particularly relevant for PR curves, where the joint uncertainty on recall and precision can be large and non-linear, especially at low recall. Four statistical methods to evaluate this uncertainty, both frequentist and Bayesian in origin, are compared in terms of coverage and speed. Of these, Wilks’ toolbox.

Cite this Paper

BibTeX

@InProceedings{pmlr-v206-urlus23a,
  title = 	 {Pointwise sampling uncertainties on the Precision-Recall curve},
  author =       {Urlus, Ralph E.Q. and Baak, Max and Collot, St\'ephane and Fridman Rojas, Ilan},
  booktitle = 	 {Proceedings of The 26th International Conference on Artificial Intelligence and Statistics},
  pages = 	 {8211--8232},
  year = 	 {2023},
  editor = 	 {Ruiz, Francisco and Dy, Jennifer and van de Meent, Jan-Willem},
  volume = 	 {206},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {25--27 Apr},
  publisher =    {PMLR},
  pdf = 	 {https://proceedings.mlr.press/v206/urlus23a/urlus23a.pdf},
  url = 	 {https://proceedings.mlr.press/v206/urlus23a.html},
  abstract = 	 {Quoting robust uncertainties on machine learning (ML) model metrics, such as f1-score, precision, recall, etc., from sources of uncertainty such as data sampling, parameter initialization, and target labelling, is typically not done in the field of data science, even though these are essential for the proper interpretation and comparison of ML models. This text shows how to calculate and visualize the impact of one dominant source of uncertainty, data sampling, on each point of the Precision-Recall (PR) and Receiver Operating Characteristic (ROC) curves. This is particularly relevant for PR curves, where the joint uncertainty on recall and precision can be large and non-linear, especially at low recall. Four statistical methods to evaluate this uncertainty, both frequentist and Bayesian in origin, are compared in terms of coverage and speed. Of these, Wilks’ toolbox.}
}

Endnote

%0 Conference Paper
%T Pointwise sampling uncertainties on the Precision-Recall curve
%A Ralph E.Q. Urlus
%A Max Baak
%A Stéphane Collot
%A Ilan Fridman Rojas
%B Proceedings of The 26th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2023
%E Francisco Ruiz
%E Jennifer Dy
%E Jan-Willem van de Meent
%F pmlr-v206-urlus23a
%I PMLR
%P 8211--8232
%U https://proceedings.mlr.press/v206/urlus23a.html
%V 206
%X Quoting robust uncertainties on machine learning (ML) model metrics, such as f1-score, precision, recall, etc., from sources of uncertainty such as data sampling, parameter initialization, and target labelling, is typically not done in the field of data science, even though these are essential for the proper interpretation and comparison of ML models. This text shows how to calculate and visualize the impact of one dominant source of uncertainty, data sampling, on each point of the Precision-Recall (PR) and Receiver Operating Characteristic (ROC) curves. This is particularly relevant for PR curves, where the joint uncertainty on recall and precision can be large and non-linear, especially at low recall. Four statistical methods to evaluate this uncertainty, both frequentist and Bayesian in origin, are compared in terms of coverage and speed. Of these, Wilks’ toolbox.

APA

Urlus, R.E., Baak, M., Collot, S. & Fridman Rojas, I. (2023). Pointwise sampling uncertainties on the Precision-Recall curve. Proceedings of The 26th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 206:8211-8232. Available from https://proceedings.mlr.press/v206/urlus23a.html.
