XIME3D: A Systematic Framework for Evaluating Explainable AI in 3D Medical Imaging under CT Image Pre-Processing Variations

Gizem Karagoz, Tanir Ozcelebi, Nirvana Meratnia
Proceedings of The Second AAAI Bridge Program on AI for Medicine and Healthcare, PMLR 317:159-168, 2026.

Abstract

Recent advancements in deep learning have enabled expert-level performance in medical imaging for disease classification, but their black-box decision-making processes limit trust in them and their widespread clinical deployment. While Explainable Artificial Intelligence (XAI) methods aim to bridge this gap, existing studies focus on 2D data or pre-processed research datasets that overlook the role of medical imaging pre-processing operations, which are an essential component of real-world 3D medical imaging workflows. To address this limitation, we propose XIME3D, a systematic and predictive-model-centered framework for evaluating explainability under realistic medical pre-processing conditions for volumetric medical data. The framework integrates five volumetric pre-processing variants and ten post-hoc attribution methods, evaluated through three complementary criteria: Correctness, Contrastivity, and Completeness, which together assess explanation dependence on model input, internal structure, and output behavior. Across more than 300 experimental configurations, XIME3D reveals that gradient-based methods, such as Integrated Gradients and Blur Integrated Gradients, provide the most consistent and model-aligned explanations, while noise-based approaches like SmoothGrad and VarGrad are less sensitive to model behavior. These findings emphasize the importance of clinically realistic evaluation pipelines for reliable explainability in 3D medical imaging.
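For context, Integrated Gradients (one of the gradient-based attribution methods named above) assigns importance by averaging model gradients along a straight path from a baseline input to the actual input, then scaling by the input difference. A minimal NumPy sketch, where the `f_grad` callable is a hypothetical stand-in for the gradient of a model's output with respect to its input (not part of the paper's code):

```python
import numpy as np

def integrated_gradients(f_grad, x, baseline, steps=50):
    """Approximate Integrated Gradients attributions.

    f_grad: callable returning the gradient of the model output w.r.t. its input.
    x: input array; baseline: reference input (e.g. an all-zeros volume).
    """
    # Average the gradients sampled along the straight path baseline -> x
    alphas = np.linspace(0.0, 1.0, steps)
    avg_grad = np.mean([f_grad(baseline + a * (x - baseline)) for a in alphas], axis=0)
    # Scale by the input difference; by the completeness axiom the
    # attributions approximately sum to f(x) - f(baseline)
    return (x - baseline) * avg_grad
```

For a toy model f(x) = sum(x**2), whose gradient is 2x, the attributions with a zero baseline come out to x**2 elementwise, so they sum exactly to f(x) - f(0).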

Cite this Paper


BibTeX
@InProceedings{pmlr-v317-karagoz26a,
  title     = {XIME3D: A Systematic Framework for Evaluating Explainable AI in 3D Medical Imaging under CT Image Pre-Processing Variations},
  author    = {Karagoz, Gizem and Ozcelebi, Tanir and Meratnia, Nirvana},
  booktitle = {Proceedings of The Second AAAI Bridge Program on AI for Medicine and Healthcare},
  pages     = {159--168},
  year      = {2026},
  editor    = {Wu, Junde and Pan, Jiazhen and Zhu, Jiayuan and Luo, Luyang and Li, Yitong and Xu, Min and Jin, Yueming and Rueckert, Daniel},
  volume    = {317},
  series    = {Proceedings of Machine Learning Research},
  month     = {20--21 Jan},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v317/main/assets/karagoz26a/karagoz26a.pdf},
  url       = {https://proceedings.mlr.press/v317/karagoz26a.html},
  abstract  = {Recent advancements in deep learning have enabled expert-level performance in medical imaging for disease classification, but their black-box decision-making processes limit trust in them and their widespread clinical deployment. While Explainable Artificial Intelligence (XAI) methods aim to bridge this gap, existing studies focus on 2D data or pre-processed research datasets that overlook the role of medical imaging pre-processing operations, which are an essential component of real-world 3D medical imaging workflows. To address this limitation, we propose XIME3D, a systematic and predictive-model-centered framework for evaluating explainability under realistic medical pre-processing conditions for volumetric medical data. The framework integrates five volumetric pre-processing variants and ten post-hoc attribution methods, evaluated through three complementary criteria: Correctness, Contrastivity, and Completeness, which together assess explanation dependence on model input, internal structure, and output behavior. Across more than 300 experimental configurations, XIME3D reveals that gradient-based methods, such as Integrated Gradients and Blur Integrated Gradients, provide the most consistent and model-aligned explanations, while noise-based approaches like SmoothGrad and VarGrad are less sensitive to model behavior. These findings emphasize the importance of clinically realistic evaluation pipelines for reliable explainability in 3D medical imaging.}
}
Endnote
%0 Conference Paper
%T XIME3D: A Systematic Framework for Evaluating Explainable AI in 3D Medical Imaging under CT Image Pre-Processing Variations
%A Gizem Karagoz
%A Tanir Ozcelebi
%A Nirvana Meratnia
%B Proceedings of The Second AAAI Bridge Program on AI for Medicine and Healthcare
%C Proceedings of Machine Learning Research
%D 2026
%E Junde Wu
%E Jiazhen Pan
%E Jiayuan Zhu
%E Luyang Luo
%E Yitong Li
%E Min Xu
%E Yueming Jin
%E Daniel Rueckert
%F pmlr-v317-karagoz26a
%I PMLR
%P 159--168
%U https://proceedings.mlr.press/v317/karagoz26a.html
%V 317
%X Recent advancements in deep learning have enabled expert-level performance in medical imaging for disease classification, but their black-box decision-making processes limit trust in them and their widespread clinical deployment. While Explainable Artificial Intelligence (XAI) methods aim to bridge this gap, existing studies focus on 2D data or pre-processed research datasets that overlook the role of medical imaging pre-processing operations, which are an essential component of real-world 3D medical imaging workflows. To address this limitation, we propose XIME3D, a systematic and predictive-model-centered framework for evaluating explainability under realistic medical pre-processing conditions for volumetric medical data. The framework integrates five volumetric pre-processing variants and ten post-hoc attribution methods, evaluated through three complementary criteria: Correctness, Contrastivity, and Completeness, which together assess explanation dependence on model input, internal structure, and output behavior. Across more than 300 experimental configurations, XIME3D reveals that gradient-based methods, such as Integrated Gradients and Blur Integrated Gradients, provide the most consistent and model-aligned explanations, while noise-based approaches like SmoothGrad and VarGrad are less sensitive to model behavior. These findings emphasize the importance of clinically realistic evaluation pipelines for reliable explainability in 3D medical imaging.
APA
Karagoz, G., Ozcelebi, T. & Meratnia, N. (2026). XIME3D: A Systematic Framework for Evaluating Explainable AI in 3D Medical Imaging under CT Image Pre-Processing Variations. Proceedings of The Second AAAI Bridge Program on AI for Medicine and Healthcare, in Proceedings of Machine Learning Research 317:159-168. Available from https://proceedings.mlr.press/v317/karagoz26a.html.