Detecting and mitigating issues in image-based COVID-19 diagnosis

João Marcos Cardoso da Silva, Pedro Martelleto Bressane Rezende, Moacir Antonelli Ponti
Proceedings of the 1st Workshop on Healthcare AI and COVID-19, ICML 2022, PMLR 184:127-135, 2022.

Abstract

As urgency over the coronavirus disease 2019 (COVID-19) increased, many datasets of chest radiography (CXR) and chest computed tomography (CT) images emerged, aimed at the detection and prognosis of COVID-19. Over the last two years, thousands of studies have been published, reporting promising results. However, a deeper analysis of the datasets and the methods employed reveals issues that may hamper conclusions and practical applicability. We investigate three major datasets commonly used in these studies, detect problems related to the existence of duplicates, address the specificity of classes within those datasets, and propose a way to perform external validation via cross-dataset evaluation. Our guidelines and findings contribute towards a trustworthy application of Machine Learning in image-based diagnosis, offer a more accurate assessment of models applied to the prognostication of diseases using image datasets, and pave the way towards models that can be relied upon in the real world.

Cite this Paper


BibTeX
@InProceedings{pmlr-v184-silva22a,
  title = {Detecting and mitigating issues in image-based COVID-19 diagnosis},
  author = {Cardoso da Silva, Jo\~ao Marcos and Martelleto Bressane Rezende, Pedro and Antonelli Ponti, Moacir},
  booktitle = {Proceedings of the 1st Workshop on Healthcare AI and COVID-19, ICML 2022},
  pages = {127--135},
  year = {2022},
  editor = {Xu, Peng and Zhu, Tingting and Zhu, Pengkai and Clifton, David A. and Belgrave, Danielle and Zhang, Yuanting},
  volume = {184},
  series = {Proceedings of Machine Learning Research},
  month = {22 Jul},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v184/silva22a/silva22a.pdf},
  url = {https://proceedings.mlr.press/v184/silva22a.html},
  abstract = {As urgency over the coronavirus disease 2019 (COVID-19) increased, many datasets with chest radiography (CXR) and chest computed tomography (CT) images emerged aiming at the detection and prognosis of COVID-19. Over the last two years, thousands of studies have been published, reporting promising results. However, a deeper analysis of the datasets and the methods employed reveals issues that may hamper conclusions and practical applicability. We investigate three major datasets commonly used in these studies, detect problems related to the existence of duplicates, address the specificity of classes within those datasets, and propose a way to perform external validation via cross-dataset evaluation. Our guidelines and findings contribute towards a trust-worthy application of Machine Learning in the context of image-based diagnosis, as well as offer a more accurate assessment of models applied to the prognostication of diseases using image datasets and pave the way towards models that can be relied upon in the real world.}
}
Endnote
%0 Conference Paper
%T Detecting and mitigating issues in image-based COVID-19 diagnosis
%A João Marcos Cardoso da Silva
%A Pedro Martelleto Bressane Rezende
%A Moacir Antonelli Ponti
%B Proceedings of the 1st Workshop on Healthcare AI and COVID-19, ICML 2022
%C Proceedings of Machine Learning Research
%D 2022
%E Peng Xu
%E Tingting Zhu
%E Pengkai Zhu
%E David A. Clifton
%E Danielle Belgrave
%E Yuanting Zhang
%F pmlr-v184-silva22a
%I PMLR
%P 127--135
%U https://proceedings.mlr.press/v184/silva22a.html
%V 184
%X As urgency over the coronavirus disease 2019 (COVID-19) increased, many datasets with chest radiography (CXR) and chest computed tomography (CT) images emerged aiming at the detection and prognosis of COVID-19. Over the last two years, thousands of studies have been published, reporting promising results. However, a deeper analysis of the datasets and the methods employed reveals issues that may hamper conclusions and practical applicability. We investigate three major datasets commonly used in these studies, detect problems related to the existence of duplicates, address the specificity of classes within those datasets, and propose a way to perform external validation via cross-dataset evaluation. Our guidelines and findings contribute towards a trust-worthy application of Machine Learning in the context of image-based diagnosis, as well as offer a more accurate assessment of models applied to the prognostication of diseases using image datasets and pave the way towards models that can be relied upon in the real world.
APA
Cardoso da Silva, J.M., Martelleto Bressane Rezende, P. & Antonelli Ponti, M. (2022). Detecting and mitigating issues in image-based COVID-19 diagnosis. Proceedings of the 1st Workshop on Healthcare AI and COVID-19, ICML 2022, in Proceedings of Machine Learning Research 184:127-135. Available from https://proceedings.mlr.press/v184/silva22a.html.
