Interplay of ROC and Precision-Recall AUCs: Theoretical Limits and Practical Implications in Binary Classification

Martin Mihelich, François Castagnos, Charles Dognin
Proceedings of the 41st International Conference on Machine Learning, PMLR 235:35639-35646, 2024.

Abstract

In this paper, we present two key theorems that should have significant implications for machine learning practitioners working with binary classification models. The first theorem provides a formula to calculate the maximum and minimum Precision-Recall AUC ($AUC_{PR}$) for a fixed Receiver Operating Characteristic AUC ($AUC_{ROC}$), demonstrating the variability of $AUC_{PR}$ even with a high $AUC_{ROC}$. This is particularly relevant for imbalanced datasets, where a good $AUC_{ROC}$ does not necessarily imply a high $AUC_{PR}$. The second theorem inversely establishes the bounds of $AUC_{ROC}$ given a fixed $AUC_{PR}$. Our findings highlight that in certain situations, especially for imbalanced datasets, it is more informative to prioritize $AUC_{PR}$ over $AUC_{ROC}$. Additionally, we introduce a method to determine when a higher $AUC_{ROC}$ in one model implies a higher $AUC_{PR}$ in another and vice versa, streamlining the model evaluation process.
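The gap between the two metrics that the abstract describes is easy to reproduce numerically. The sketch below is not from the paper; it is a minimal, hand-constructed example (scores and counts chosen for illustration) that computes $AUC_{ROC}$ via the Mann-Whitney pairwise form and estimates $AUC_{PR}$ via average precision, a standard estimator. With 100 negatives and only 2 positives, the same scores yield a high $AUC_{ROC}$ but a much lower $AUC_{PR}$:

```python
def auc_roc(pos, neg):
    """Mann-Whitney form: fraction of (positive, negative) score pairs
    ranked correctly, counting ties as half a win."""
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def average_precision(pos, neg):
    """Average precision: mean of precision evaluated at the rank of
    each positive in the descending score ordering."""
    ranked = sorted([(s, 1) for s in pos] + [(s, 0) for s in neg],
                    key=lambda t: -t[0])
    tp, precisions = 0, []
    for rank, (score, label) in enumerate(ranked, start=1):
        if label == 1:
            tp += 1
            precisions.append(tp / rank)
    return sum(precisions) / len(precisions)

# 2 positives scored near the top, 100 negatives spread evenly over (0, 1).
pos = [0.95, 0.94]
neg = [(i + 0.5) / 100 for i in range(100)]

print(f"AUC_ROC = {auc_roc(pos, neg):.3f}")            # 0.945
print(f"AUC_PR  = {average_precision(pos, neg):.3f}")  # 0.208
```

Only five negatives outrank the positives, so almost all of the 200 positive-negative pairs are ordered correctly ($AUC_{ROC} = 0.945$), yet those same five negatives crush the precision at each positive's rank, illustrating the paper's point that a high $AUC_{ROC}$ need not imply a high $AUC_{PR}$ under class imbalance.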

Cite this Paper


BibTeX
@InProceedings{pmlr-v235-mihelich24a,
  title     = {Interplay of {ROC} and Precision-Recall {AUC}s: Theoretical Limits and Practical Implications in Binary Classification},
  author    = {Mihelich, Martin and Castagnos, Fran\c{c}ois and Dognin, Charles},
  booktitle = {Proceedings of the 41st International Conference on Machine Learning},
  pages     = {35639--35646},
  year      = {2024},
  editor    = {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix},
  volume    = {235},
  series    = {Proceedings of Machine Learning Research},
  month     = {21--27 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v235/main/assets/mihelich24a/mihelich24a.pdf},
  url       = {https://proceedings.mlr.press/v235/mihelich24a.html},
  abstract  = {In this paper, we present two key theorems that should have significant implications for machine learning practitioners working with binary classification models. The first theorem provides a formula to calculate the maximum and minimum Precision-Recall AUC ($AUC_{PR}$) for a fixed Receiver Operating Characteristic AUC ($AUC_{ROC}$), demonstrating the variability of $AUC_{PR}$ even with a high $AUC_{ROC}$. This is particularly relevant for imbalanced datasets, where a good $AUC_{ROC}$ does not necessarily imply a high $AUC_{PR}$. The second theorem inversely establishes the bounds of $AUC_{ROC}$ given a fixed $AUC_{PR}$. Our findings highlight that in certain situations, especially for imbalanced datasets, it is more informative to prioritize $AUC_{PR}$ over $AUC_{ROC}$. Additionally, we introduce a method to determine when a higher $AUC_{ROC}$ in one model implies a higher $AUC_{PR}$ in another and vice versa, streamlining the model evaluation process.}
}
Endnote
%0 Conference Paper
%T Interplay of ROC and Precision-Recall AUCs: Theoretical Limits and Practical Implications in Binary Classification
%A Martin Mihelich
%A François Castagnos
%A Charles Dognin
%B Proceedings of the 41st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Ruslan Salakhutdinov
%E Zico Kolter
%E Katherine Heller
%E Adrian Weller
%E Nuria Oliver
%E Jonathan Scarlett
%E Felix Berkenkamp
%F pmlr-v235-mihelich24a
%I PMLR
%P 35639--35646
%U https://proceedings.mlr.press/v235/mihelich24a.html
%V 235
%X In this paper, we present two key theorems that should have significant implications for machine learning practitioners working with binary classification models. The first theorem provides a formula to calculate the maximum and minimum Precision-Recall AUC ($AUC_{PR}$) for a fixed Receiver Operating Characteristic AUC ($AUC_{ROC}$), demonstrating the variability of $AUC_{PR}$ even with a high $AUC_{ROC}$. This is particularly relevant for imbalanced datasets, where a good $AUC_{ROC}$ does not necessarily imply a high $AUC_{PR}$. The second theorem inversely establishes the bounds of $AUC_{ROC}$ given a fixed $AUC_{PR}$. Our findings highlight that in certain situations, especially for imbalanced datasets, it is more informative to prioritize $AUC_{PR}$ over $AUC_{ROC}$. Additionally, we introduce a method to determine when a higher $AUC_{ROC}$ in one model implies a higher $AUC_{PR}$ in another and vice versa, streamlining the model evaluation process.
APA
Mihelich, M., Castagnos, F. & Dognin, C. (2024). Interplay of ROC and Precision-Recall AUCs: Theoretical Limits and Practical Implications in Binary Classification. Proceedings of the 41st International Conference on Machine Learning, in Proceedings of Machine Learning Research 235:35639-35646. Available from https://proceedings.mlr.press/v235/mihelich24a.html.