Evaluating Age-Related Anatomical Consistency in Synthetic Brain MRI against Real-World Alzheimer’s Disease Data.

Hadya Yassin; Jana Fehr; Wei-Cheng Lai; Alina Krichevsky; Alexander Rakowski; Christoph Lippert

Proceedings of The 7th International Conference on Medical Imaging with Deep Learning, PMLR 250:1801-1822, 2024.

Abstract

This study examines the realism of medical images created with deep generative models, specifically their replication of aging and Alzheimer's disease (AD) related anatomical changes. Previous research focused on developing generative methods with limited attention to image fidelity. We aim to assess the resemblance of brain MRI generated by a StyleGAN3 model with causal controls to neurodegenerative changes. For a benchmark, we conducted a visual Turing test (VTT) to see if radiologists could distinguish between synthetic and real images. Then, we employed a U-Net-based model to segment hallmarks relevant to normal aging and AD. Finally, we conducted statistical tests for our hypothesis that no significant differences existed between real and synthetic images. VTT results showed radiologists struggled to differentiate between image types, highlighting VTT's limitations due to subjectivity and time constraints. We found slight hippocampus distribution differences ($\textit{P}$ = 5.7e-2) and significant lateral ventricle discrepancies ($\textit{P}$s $<$ 5.0e-2), indicating higher hippocampus realism and ventricle size inconsistencies. The model more effectively simulated changes in the hippocampus than in the lateral ventricles, where difficulties were encountered with certain subgroups. We conclude that the VTT alone is inadequate for a comprehensive quality evaluation, promoting a more objective approach. Future research could adapt our approach to evaluate other generated medical images intended for different downstream tasks. For reproducibility, we provide detailed code implementation$^1$.

Cite this Paper

BibTeX

@InProceedings{pmlr-v250-yassin24a,
  title = 	 {Evaluating Age-Related Anatomical Consistency in Synthetic Brain MRI against Real-World Alzheimer’s Disease Data.},
  author =       {Yassin, Hadya and Fehr, Jana and Lai, Wei-Cheng and Krichevsky, Alina and Rakowski, Alexander and Lippert, Christoph},
  booktitle = 	 {Proceedings of The 7th International Conference on Medical Imaging with Deep Learning},
  pages = 	 {1801--1822},
  year = 	 {2024},
  editor = 	 {Burgos, Ninon and Petitjean, Caroline and Vakalopoulou, Maria and Christodoulidis, Stergios and Coupe, Pierrick and Delingette, Hervé and Lartizien, Carole and Mateus, Diana},
  volume = 	 {250},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {03--05 Jul},
  publisher =    {PMLR},
  pdf = 	 {https://raw.githubusercontent.com/mlresearch/v250/main/assets/yassin24a/yassin24a.pdf},
  url = 	 {https://proceedings.mlr.press/v250/yassin24a.html},
  abstract = 	 {This study examines the realism of medical images created with deep generative models, specifically their replication of aging and Alzheimer's disease (AD) related anatomical changes. Previous research focused on developing generative methods with limited attention to image fidelity. We aim to assess the resemblance of brain MRI generated by a StyleGAN3 model with causal controls to neurodegenerative changes. For a benchmark, we conducted a visual Turing test (VTT) to see if radiologists could distinguish between synthetic and real images. Then, we employed a U-Net-based model to segment hallmarks relevant to normal aging and AD. Finally, we conducted statistical tests for our hypothesis that no significant differences existed between real and synthetic images. VTT results showed radiologists struggled to differentiate between image types, highlighting VTT's limitations due to subjectivity and time constraints. We found slight hippocampus distribution differences ($\textit{P}$ = 5.7e-2) and significant lateral ventricle discrepancies ($\textit{P}$s $<$ 5.0e-2), indicating higher hippocampus realism and ventricle size inconsistencies. The model more effectively simulated changes in the hippocampus than in the lateral ventricles, where difficulties were encountered with certain subgroups. We conclude that the VTT alone is inadequate for a comprehensive quality evaluation, promoting a more objective approach. Future research could adapt our approach to evaluate other generated medical images intended for different downstream tasks. For reproducibility, we provide detailed code implementation$^1$.}
}

Endnote

%0 Conference Paper
%T Evaluating Age-Related Anatomical Consistency in Synthetic Brain MRI against Real-World Alzheimer’s Disease Data.
%A Hadya Yassin
%A Jana Fehr
%A Wei-Cheng Lai
%A Alina Krichevsky
%A Alexander Rakowski
%A Christoph Lippert
%B Proceedings of The 7th International Conference on Medical Imaging with Deep Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Ninon Burgos
%E Caroline Petitjean
%E Maria Vakalopoulou
%E Stergios Christodoulidis
%E Pierrick Coupe
%E Hervé Delingette
%E Carole Lartizien
%E Diana Mateus
%F pmlr-v250-yassin24a
%I PMLR
%P 1801--1822
%U https://proceedings.mlr.press/v250/yassin24a.html
%V 250
%X This study examines the realism of medical images created with deep generative models, specifically their replication of aging and Alzheimer's disease (AD) related anatomical changes. Previous research focused on developing generative methods with limited attention to image fidelity. We aim to assess the resemblance of brain MRI generated by a StyleGAN3 model with causal controls to neurodegenerative changes. For a benchmark, we conducted a visual Turing test (VTT) to see if radiologists could distinguish between synthetic and real images. Then, we employed a U-Net-based model to segment hallmarks relevant to normal aging and AD. Finally, we conducted statistical tests for our hypothesis that no significant differences existed between real and synthetic images. VTT results showed radiologists struggled to differentiate between image types, highlighting VTT's limitations due to subjectivity and time constraints. We found slight hippocampus distribution differences ($\textit{P}$ = 5.7e-2) and significant lateral ventricle discrepancies ($\textit{P}$s $<$ 5.0e-2), indicating higher hippocampus realism and ventricle size inconsistencies. The model more effectively simulated changes in the hippocampus than in the lateral ventricles, where difficulties were encountered with certain subgroups. We conclude that the VTT alone is inadequate for a comprehensive quality evaluation, promoting a more objective approach. Future research could adapt our approach to evaluate other generated medical images intended for different downstream tasks. For reproducibility, we provide detailed code implementation$^1$.

APA

Yassin, H., Fehr, J., Lai, W.-C., Krichevsky, A., Rakowski, A., & Lippert, C. (2024). Evaluating Age-Related Anatomical Consistency in Synthetic Brain MRI against Real-World Alzheimer’s Disease Data. Proceedings of The 7th International Conference on Medical Imaging with Deep Learning, in Proceedings of Machine Learning Research 250:1801-1822. Available from https://proceedings.mlr.press/v250/yassin24a.html.
