PRISM: High-Resolution & Precise Counterfactual Medical Image Generation using Language-guided Stable Diffusion

Amar Kumar; Anita Kriz; Mohammad Havaei; Tal Arbel

PRISM: High-Resolution & Precise Counterfactual Medical Image Generation using Language-guided Stable Diffusion

Amar Kumar, Anita Kriz, Mohammad Havaei, Tal Arbel

Proceedings of The 8th International Conference on Medical Imaging with Deep Learning, PMLR 301:838-863, 2026.

Abstract

Developing reliable and generalizable deep learning systems for medical imaging faces significant obstacles due to spurious correlations, data imbalances, and limited text annotations in datasets. Addressing these challenges requires architectures robust to the unique complexities posed by medical imaging data. The rapid advancements in vision-language foundation models within the natural image domain prompt the question of how they can be adapted for medical imaging tasks. In this work, we present PRISM, a framework that leverages foundation models to generate high-resolution, language-guided medical image counterfactuals using Stable Diffusion. Our approach demonstrates unprecedented precision in selectively modifying spurious correlations (the medical devices) and disease features, enabling the removal and addition of specific attributes while preserving other image characteristics. Through extensive evaluation, we show how PRISM advances counterfactual generation and enables the development of more robust downstream classifiers for clinically deployable solutions. To facilitate broader adoption and research, we make our code publicly available at https://github.com/Amarkr1/PRISM.

Cite this Paper

BibTeX

@InProceedings{pmlr-v301-kumar26a,
  title = 	 {PRISM: High-Resolution & Precise Counterfactual Medical Image Generation using Language-guided Stable Diffusion},
  author =       {Kumar, Amar and Kriz, Anita and Havaei, Mohammad and Arbel, Tal},
  booktitle = 	 {Proceedings of The 8th International Conference on Medical Imaging with Deep Learning},
  pages = 	 {838--863},
  year = 	 {2026},
  editor = 	 {Tasdizen, Tolga and Elhabian, Shireen and Summers, Ronald and Chen, Chen and Koch, Lisa and Zhuang, Yan},
  volume = 	 {301},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {09--11 Jul},
  publisher =    {PMLR},
  pdf = 	 {https://raw.githubusercontent.com/mlresearch/v301/main/assets/kumar26a/kumar26a.pdf},
  url = 	 {https://proceedings.mlr.press/v301/kumar26a.html},
  abstract = 	 {Developing reliable and generalizable deep learning systems for medical imaging faces significant obstacles due to spurious correlations, data imbalances, and limited text annotations in datasets. Addressing these challenges requires architectures robust to the unique complexities posed by medical imaging data. The rapid advancements in vision-language foundation models within the natural image domain prompt the question of how they can be adapted for medical imaging tasks. In this work, we present PRISM, a framework that leverages foundation models to generate high-resolution, language-guided medical image counterfactuals using Stable Diffusion. Our approach demonstrates unprecedented precision in selectively modifying spurious correlations (the medical devices) and disease features, enabling the removal and addition of specific attributes while preserving other image characteristics. Through extensive evaluation, we show how PRISM advances counterfactual generation and enables the development of more robust downstream classifiers for clinically deployable solutions. To facilitate broader adoption and research, we make our code publicly available at https://github.com/Amarkr1/PRISM.}
}

Endnote

%0 Conference Paper
%T PRISM: High-Resolution & Precise Counterfactual Medical Image Generation using Language-guided Stable Diffusion
%A Amar Kumar
%A Anita Kriz
%A Mohammad Havaei
%A Tal Arbel
%B Proceedings of The 8th International Conference on Medical Imaging with Deep Learning
%C Proceedings of Machine Learning Research
%D 2026
%E Tolga Tasdizen
%E Shireen Elhabian
%E Ronald Summers
%E Chen Chen
%E Lisa Koch
%E Yan Zhuang	
%F pmlr-v301-kumar26a
%I PMLR
%P 838--863
%U https://proceedings.mlr.press/v301/kumar26a.html
%V 301
%X Developing reliable and generalizable deep learning systems for medical imaging faces significant obstacles due to spurious correlations, data imbalances, and limited text annotations in datasets. Addressing these challenges requires architectures robust to the unique complexities posed by medical imaging data. The rapid advancements in vision-language foundation models within the natural image domain prompt the question of how they can be adapted for medical imaging tasks. In this work, we present PRISM, a framework that leverages foundation models to generate high-resolution, language-guided medical image counterfactuals using Stable Diffusion. Our approach demonstrates unprecedented precision in selectively modifying spurious correlations (the medical devices) and disease features, enabling the removal and addition of specific attributes while preserving other image characteristics. Through extensive evaluation, we show how PRISM advances counterfactual generation and enables the development of more robust downstream classifiers for clinically deployable solutions. To facilitate broader adoption and research, we make our code publicly available at https://github.com/Amarkr1/PRISM.

APA

Kumar, A., Kriz, A., Havaei, M. & Arbel, T.. (2026). PRISM: High-Resolution & Precise Counterfactual Medical Image Generation using Language-guided Stable Diffusion. Proceedings of The 8th International Conference on Medical Imaging with Deep Learning, in Proceedings of Machine Learning Research 301:838-863 Available from https://proceedings.mlr.press/v301/kumar26a.html.

Related Material

Download PDF