Multi-centric Comparison of Deep Learning Models for Lesion Detection in Breast MRI

Kai Geißler; Markus Wenzel; Susanne Diekmann; Heinrich von Busch; Robert Grimm; Hans Meine

Proceedings of The 8th International Conference on Medical Imaging with Deep Learning, PMLR 301:458-474, 2026.

Abstract

Breast magnetic resonance imaging (MRI) is a common modality for diagnostic imaging in breast cancer, creating a need for automated image analysis to assist in early detection and diagnosis. In this study, we compare multiple deep learning-based segmentation and detection algorithms for lesion detection in dynamic contrast-enhanced (DCE) breast MRI. We utilized a large multi-centric dataset comprising T1-weighted DCE MR images from nine clinical sites across seven countries, encompassing diverse imaging characteristics and scanner types. We evaluated several models, including the standard nnU-Net, an adapted nnU-Net with modifications to reduce false positives, a coarse-resolution version thereof, the transformer-based SwinUNETR-V2, and nnDetection. The standard nnU-Net achieved a high lesion-level sensitivity of 83.8% but produced an average of 3.334 false positives per case, which is impractical for clinical use. The adapted (coarse) nnU-Net significantly reduced false positives to 0.666 (0.397) per case with a slight decrease in sensitivity to 79.9% (75.8%). SwinUNETR-V2 achieved comparable performance to the adapted nnU-Net. nnDetection outperformed nnU-Net in the high-sensitivity region, but performed worse than the adapted models in the lower-sensitivity region, with respect to false positives. To conclude, the nnU-Net again provides a good baseline, but our lesion detection task motivates adaptations to reduce the number of false positives.

Cite this Paper

BibTeX

@InProceedings{pmlr-v301-geissler26a,
  title = 	 {Multi-centric Comparison of Deep Learning Models for Lesion Detection in Breast MRI},
  author =       {Gei{\ss}ler, Kai and Wenzel, Markus and Diekmann, Susanne and von Busch, Heinrich and Grimm, Robert and Meine, Hans},
  booktitle = 	 {Proceedings of The 8th International Conference on Medical Imaging with Deep Learning},
  pages = 	 {458--474},
  year = 	 {2026},
  editor = 	 {Tasdizen, Tolga and Elhabian, Shireen and Summers, Ronald and Chen, Chen and Koch, Lisa and Zhuang, Yan},
  volume = 	 {301},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {09--11 Jul},
  publisher =    {PMLR},
  pdf = 	 {https://raw.githubusercontent.com/mlresearch/v301/main/assets/geissler26a/geissler26a.pdf},
  url = 	 {https://proceedings.mlr.press/v301/geissler26a.html},
  abstract = 	 {Breast magnetic resonance imaging (MRI) is a common modality for diagnostic imaging in breast cancer, creating a need for automated image analysis to assist in early detection and diagnosis. In this study, we compare multiple deep learning-based segmentation and detection algorithms for lesion detection in dynamic contrast-enhanced (DCE) breast MRI. We utilized a large multi-centric dataset comprising T1-weighted DCE MR images from nine clinical sites across seven countries, encompassing diverse imaging characteristics and scanner types. We evaluated several models, including the standard nnU-Net, an adapted nnU-Net with modifications to reduce false positives, a coarse-resolution version thereof, the transformer-based SwinUNETR-V2, and nnDetection. The standard nnU-Net achieved a high lesion-level sensitivity of 83.8% but produced an average of 3.334 false positives per case, which is impractical for clinical use. The adapted (coarse) nnU-Net significantly reduced false positives to 0.666 (0.397) per case with a slight decrease in sensitivity to 79.9% (75.8%). SwinUNETR-V2 achieved comparable performance to the adapted nnU-Net. nnDetection outperformed nnU-Net in the high-sensitivity region, but performed worse than the adapted models in the lower-sensitivity region, with respect to false positives. To conclude, the nnU-Net again provides a good baseline, but our lesion detection task motivates adaptations to reduce the number of false positives.}
}

Endnote

%0 Conference Paper
%T Multi-centric Comparison of Deep Learning Models for Lesion Detection in Breast MRI
%A Kai Geißler
%A Markus Wenzel
%A Susanne Diekmann
%A Heinrich von Busch
%A Robert Grimm
%A Hans Meine
%B Proceedings of The 8th International Conference on Medical Imaging with Deep Learning
%C Proceedings of Machine Learning Research
%D 2026
%E Tolga Tasdizen
%E Shireen Elhabian
%E Ronald Summers
%E Chen Chen
%E Lisa Koch
%E Yan Zhuang
%F pmlr-v301-geissler26a
%I PMLR
%P 458--474
%U https://proceedings.mlr.press/v301/geissler26a.html
%V 301
%X Breast magnetic resonance imaging (MRI) is a common modality for diagnostic imaging in breast cancer, creating a need for automated image analysis to assist in early detection and diagnosis. In this study, we compare multiple deep learning-based segmentation and detection algorithms for lesion detection in dynamic contrast-enhanced (DCE) breast MRI. We utilized a large multi-centric dataset comprising T1-weighted DCE MR images from nine clinical sites across seven countries, encompassing diverse imaging characteristics and scanner types. We evaluated several models, including the standard nnU-Net, an adapted nnU-Net with modifications to reduce false positives, a coarse-resolution version thereof, the transformer-based SwinUNETR-V2, and nnDetection. The standard nnU-Net achieved a high lesion-level sensitivity of 83.8% but produced an average of 3.334 false positives per case, which is impractical for clinical use. The adapted (coarse) nnU-Net significantly reduced false positives to 0.666 (0.397) per case with a slight decrease in sensitivity to 79.9% (75.8%). SwinUNETR-V2 achieved comparable performance to the adapted nnU-Net. nnDetection outperformed nnU-Net in the high-sensitivity region, but performed worse than the adapted models in the lower-sensitivity region, with respect to false positives. To conclude, the nnU-Net again provides a good baseline, but our lesion detection task motivates adaptations to reduce the number of false positives.

APA

Geißler, K., Wenzel, M., Diekmann, S., von Busch, H., Grimm, R., & Meine, H. (2026). Multi-centric Comparison of Deep Learning Models for Lesion Detection in Breast MRI. Proceedings of The 8th International Conference on Medical Imaging with Deep Learning, in Proceedings of Machine Learning Research 301:458-474. Available from https://proceedings.mlr.press/v301/geissler26a.html.
