Slide-SAM: Medical SAM Meets Sliding Window

Quan Quan, Fenghe Tang, Zikang Xu, Heqin Zhu, S Kevin Zhou
Proceedings of The 7nd International Conference on Medical Imaging with Deep Learning, PMLR 250:1179-1195, 2024.

Abstract

The Segment Anything Model (SAM) has achieved notable success in two-dimensional segmentation of natural images. However, the substantial gap between medical and natural images hinders its direct application to medical image segmentation tasks. Particularly in 3D medical images, SAM struggles to learn contextual relationships between slices, limiting its practical applicability. Moreover, applying 2D SAM to 3D images requires prompting the entire volume, which is time- and label-consuming. To address these problems, we propose Slide-SAM, which treats a stack of three adjacent slices as a prediction window. It first takes three slices from a 3D volume and point or bounding-box prompts on the central slice as inputs to predict segmentation masks for all three slices. The masks of the top and bottom slices are then used to generate new prompts for adjacent slices. Finally, step-wise prediction can be achieved by sliding the prediction window forward or backward through the entire volume. Our model is trained on multiple public and private medical datasets and demonstrates its effectiveness through extensive 3D segmentation experiments with the help of minimal prompts. Code is available at https://github.com/Curli-quan/Slide-SAM.
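
To make the sliding-window procedure concrete, below is a minimal Python sketch of the inference loop the abstract describes. The names `model` (a Slide-SAM network mapping a 3-slice stack plus central-slice prompts to three masks) and `mask_to_prompt` (a helper that turns a predicted mask into a new box or point prompt) are hypothetical stand-ins for illustration, not the authors' actual API; see the repository linked above for the real implementation.

    # Sketch of Slide-SAM's sliding-window inference over a 3D volume.
    # Assumes the prompted central slice is an interior slice
    # (1 <= center_idx <= depth - 2).
    import numpy as np

    def slide_sam_infer(volume, center_idx, center_prompt, model, mask_to_prompt):
        """Propagate segmentation through a volume of shape (D, H, W)
        from one prompted central slice, one slice at a time."""
        depth = volume.shape[0]
        masks = [None] * depth

        # Initial window: slices (center-1, center, center+1),
        # with the user prompt placed on the central slice.
        window = volume[center_idx - 1 : center_idx + 2]
        m_prev, m_mid, m_next = model(window, center_prompt)
        masks[center_idx - 1 : center_idx + 2] = [m_prev, m_mid, m_next]

        # Slide forward: the top-slice mask seeds the prompt for the
        # next window, until the volume ends or the mask vanishes.
        i = center_idx + 1
        while i + 1 < depth and masks[i].any():
            window = volume[i - 1 : i + 2]
            _, _, m_next = model(window, mask_to_prompt(masks[i]))
            masks[i + 1] = m_next
            i += 1

        # Slide backward symmetrically using the bottom-slice mask.
        i = center_idx - 1
        while i - 1 >= 0 and masks[i].any():
            window = volume[i - 1 : i + 2]
            m_prev, _, _ = model(window, mask_to_prompt(masks[i]))
            masks[i - 1] = m_prev
            i -= 1

        # Slices never reached (mask vanished) are left empty.
        return np.stack([m if m is not None else np.zeros(volume.shape[1:], bool)
                         for m in masks])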

Cite this Paper

BibTeX
@InProceedings{pmlr-v250-quan24a,
  title = {Slide-SAM: Medical SAM Meets Sliding Window},
  author = {Quan, Quan and Tang, Fenghe and Xu, Zikang and Zhu, Heqin and Zhou, S Kevin},
  booktitle = {Proceedings of The 7th International Conference on Medical Imaging with Deep Learning},
  pages = {1179--1195},
  year = {2024},
  editor = {Burgos, Ninon and Petitjean, Caroline and Vakalopoulou, Maria and Christodoulidis, Stergios and Coupe, Pierrick and Delingette, Hervé and Lartizien, Carole and Mateus, Diana},
  volume = {250},
  series = {Proceedings of Machine Learning Research},
  month = {03--05 Jul},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v250/main/assets/quan24a/quan24a.pdf},
  url = {https://proceedings.mlr.press/v250/quan24a.html},
  abstract = {The Segment Anything Model (SAM) has achieved notable success in two-dimensional segmentation of natural images. However, the substantial gap between medical and natural images hinders its direct application to medical image segmentation tasks. Particularly in 3D medical images, SAM struggles to learn contextual relationships between slices, limiting its practical applicability. Moreover, applying 2D SAM to 3D images requires prompting the entire volume, which is time- and label-consuming. To address these problems, we propose Slide-SAM, which treats a stack of three adjacent slices as a prediction window. It first takes three slices from a 3D volume and point or bounding-box prompts on the central slice as inputs to predict segmentation masks for all three slices. The masks of the top and bottom slices are then used to generate new prompts for adjacent slices. Finally, step-wise prediction can be achieved by sliding the prediction window forward or backward through the entire volume. Our model is trained on multiple public and private medical datasets and demonstrates its effectiveness through extensive 3D segmentation experiments with the help of minimal prompts. Code is available at https://github.com/Curli-quan/Slide-SAM.}
}
Endnote
%0 Conference Paper
%T Slide-SAM: Medical SAM Meets Sliding Window
%A Quan Quan
%A Fenghe Tang
%A Zikang Xu
%A Heqin Zhu
%A S Kevin Zhou
%B Proceedings of The 7th International Conference on Medical Imaging with Deep Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Ninon Burgos
%E Caroline Petitjean
%E Maria Vakalopoulou
%E Stergios Christodoulidis
%E Pierrick Coupe
%E Hervé Delingette
%E Carole Lartizien
%E Diana Mateus
%F pmlr-v250-quan24a
%I PMLR
%P 1179--1195
%U https://proceedings.mlr.press/v250/quan24a.html
%V 250
%X The Segment Anything Model (SAM) has achieved notable success in two-dimensional segmentation of natural images. However, the substantial gap between medical and natural images hinders its direct application to medical image segmentation tasks. Particularly in 3D medical images, SAM struggles to learn contextual relationships between slices, limiting its practical applicability. Moreover, applying 2D SAM to 3D images requires prompting the entire volume, which is time- and label-consuming. To address these problems, we propose Slide-SAM, which treats a stack of three adjacent slices as a prediction window. It first takes three slices from a 3D volume and point or bounding-box prompts on the central slice as inputs to predict segmentation masks for all three slices. The masks of the top and bottom slices are then used to generate new prompts for adjacent slices. Finally, step-wise prediction can be achieved by sliding the prediction window forward or backward through the entire volume. Our model is trained on multiple public and private medical datasets and demonstrates its effectiveness through extensive 3D segmentation experiments with the help of minimal prompts. Code is available at https://github.com/Curli-quan/Slide-SAM.
APA
Quan, Q., Tang, F., Xu, Z., Zhu, H. & Zhou, S. K. (2024). Slide-SAM: Medical SAM Meets Sliding Window. Proceedings of The 7th International Conference on Medical Imaging with Deep Learning, in Proceedings of Machine Learning Research 250:1179-1195. Available from https://proceedings.mlr.press/v250/quan24a.html.
