Explainable Medical Image Segmentation via Attention-Gated Fusion of Vision Transformers and U-Nets

Mahmoud Khalaf; Robin Cohen; Saidul Islam; Jamal Bentahar

Proceedings of the 39th Canadian Conference on Artificial Intelligence, PMLR 318:272-283, 2026.

Abstract

Medical image segmentation is essential for assisting medical professionals in locating anomalies in images. The lack of explainability in current medical image segmentation frameworks demonstrates a gap in assisting clinicians in understanding how segmentation decisions are made, towards identifying the segmentation target. In this paper, we present a framework that offers an improved approach for assisting medical professionals in locating anomalies while providing visual explanations in the form of heatmaps of the target. We propose a dual encoder architecture using a U-Net encoder and Vision Transformer to perform accurate segmentation. We employ an attention fusion mechanism to fuse both encoder embeddings and generate an explainability heatmap that offers improved results for highlighting important features. We include a discussion that reflects on the ways in which our approach advances the state of the art for medical decision making, in comparison with other current research, elaborating as well on how the approach can be of value for distinct healthcare concerns. While our current results focus on how our dual encoder approach yields significant benefit, we also briefly discuss how to integrate textual explanations alongside, as a valued step forward for future work.

Keywords: Explainable AI, Medical Applications of AI, Computer Vision Segmentation, AI for Social Good, Transformers, Attention.
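
The abstract does not spell out how the two encoder embeddings are fused, so the following is only an illustrative sketch of one common pattern (additive attention gating), not the authors' architecture. It assumes both encoders have already produced feature maps of the same spatial size and channel count; the function name `attention_gated_fusion` and the weights `w_cnn`, `w_vit`, and `w_psi` are hypothetical. The per-pixel gate it returns is the kind of quantity that could be rendered as a heatmap.

```python
import numpy as np


def sigmoid(x):
    """Elementwise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))


def attention_gated_fusion(cnn_feat, vit_feat, w_cnn, w_vit, w_psi):
    """Fuse two feature maps with a per-pixel attention gate.

    Hypothetical sketch, not the paper's method.
    cnn_feat, vit_feat: arrays of shape (H, W, C)
    w_cnn, w_vit:       projection weights of shape (C, K)
    w_psi:              gate weights of shape (K,)
    Returns (fused, gate) with shapes (H, W, C) and (H, W).
    """
    # Project both embeddings into a shared K-dimensional space,
    # add them, and apply a ReLU (additive attention).
    joint = np.maximum(cnn_feat @ w_cnn + vit_feat @ w_vit, 0.0)
    # Collapse the K channels to one scalar gate per pixel in (0, 1).
    gate = sigmoid(joint @ w_psi)
    # Convex combination: the gate weights the CNN branch,
    # its complement weights the transformer branch.
    g = gate[..., None]
    fused = g * cnn_feat + (1.0 - g) * vit_feat
    return fused, gate


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    H, W, C, K = 8, 8, 16, 4
    cnn = rng.normal(size=(H, W, C))
    vit = rng.normal(size=(H, W, C))
    fused, gate = attention_gated_fusion(
        cnn, vit,
        rng.normal(size=(C, K)), rng.normal(size=(C, K)), rng.normal(size=K),
    )
    print(fused.shape, gate.shape)
```

In this sketch the gate doubles as a crude saliency map: values near 1 mean the convolutional features dominate at that pixel, values near 0 mean the transformer features do. Whether the paper derives its heatmap this way is not stated in the abstract.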

Cite this Paper

BibTeX

@InProceedings{pmlr-v318-khalaf26a,
  title = 	 {Explainable Medical Image Segmentation via Attention-Gated Fusion of Vision Transformers and U-Nets},
  author =       {Khalaf, Mahmoud and Cohen, Robin and Islam, Saidul and Bentahar, Jamal},
  booktitle = 	 {Proceedings of the 39th Canadian Conference on Artificial Intelligence},
  pages = 	 {272--283},
  year = 	 {2026},
  editor = 	 {Bouzar-Benlabiod, Lydia and Leung, Carson},
  volume = 	 {318},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {25--29 May},
  publisher =    {PMLR},
  pdf = 	 {https://raw.githubusercontent.com/mlresearch/v318/main/assets/khalaf26a/khalaf26a.pdf},
  url = 	 {https://proceedings.mlr.press/v318/khalaf26a.html},
  abstract = 	 {Medical image segmentation is essential for assisting medical professionals in locating anomalies in images. The lack of explainability in current medical image segmentation frameworks demonstrates a gap in assisting clinicians in understanding how segmentation decisions are made, towards identifying the segmentation target. In this paper, we present a framework that offers an improved approach for assisting medical professionals in locating anomalies while providing visual explanations in the form of heatmaps of the target. We propose a dual encoder architecture using a U-Net encoder and Vision Transformer to perform accurate segmentation. We employ an attention fusion mechanism to fuse both encoder embeddings and generate an explainability heatmap that offers improved results for highlighting important features. We include a discussion that reflects on the ways in which our approach advances the state of the art for medical decision making, in comparison with other current research, elaborating as well on how the approach can be of value for distinct healthcare concerns. While our current results focus on how our dual encoder approach yields significant benefit, we also briefly discuss how to integrate textual explanations alongside, as a valued step forward for future work. Keywords: Explainable AI, Medical Applications of AI, Computer Vision Segmentation, AI for Social Good, Transformers, Attention.}
}

Endnote

%0 Conference Paper
%T Explainable Medical Image Segmentation via Attention-Gated Fusion of Vision Transformers and U-Nets
%A Mahmoud Khalaf
%A Robin Cohen
%A Saidul Islam
%A Jamal Bentahar
%B Proceedings of the 39th Canadian Conference on Artificial Intelligence
%C Proceedings of Machine Learning Research
%D 2026
%E Lydia Bouzar-Benlabiod
%E Carson Leung
%F pmlr-v318-khalaf26a
%I PMLR
%P 272--283
%U https://proceedings.mlr.press/v318/khalaf26a.html
%V 318
%X Medical image segmentation is essential for assisting medical professionals in locating anomalies in images. The lack of explainability in current medical image segmentation frameworks demonstrates a gap in assisting clinicians in understanding how segmentation decisions are made, towards identifying the segmentation target. In this paper, we present a framework that offers an improved approach for assisting medical professionals in locating anomalies while providing visual explanations in the form of heatmaps of the target. We propose a dual encoder architecture using a U-Net encoder and Vision Transformer to perform accurate segmentation. We employ an attention fusion mechanism to fuse both encoder embeddings and generate an explainability heatmap that offers improved results for highlighting important features. We include a discussion that reflects on the ways in which our approach advances the state of the art for medical decision making, in comparison with other current research, elaborating as well on how the approach can be of value for distinct healthcare concerns. While our current results focus on how our dual encoder approach yields significant benefit, we also briefly discuss how to integrate textual explanations alongside, as a valued step forward for future work. Keywords: Explainable AI, Medical Applications of AI, Computer Vision Segmentation, AI for Social Good, Transformers, Attention.

APA

Khalaf, M., Cohen, R., Islam, S. & Bentahar, J. (2026). Explainable Medical Image Segmentation via Attention-Gated Fusion of Vision Transformers and U-Nets. Proceedings of the 39th Canadian Conference on Artificial Intelligence, in Proceedings of Machine Learning Research 318:272-283. Available from https://proceedings.mlr.press/v318/khalaf26a.html.
