State-of-the-Art Text-Prompted Medical Segmentation Models Struggle to Ground Chest CT Findings
Proceedings of the 10th Machine Learning for Healthcare Conference, PMLR 298, 2025.
Abstract
This study presents a comprehensive evaluation of state-of-the-art text-prompted segmentation models, including SAM2, MedSAM2, SegVol, SAT, and BiomedParse, on ReXGrounding, a novel dataset that pairs chest CT findings with corresponding segmentation masks. Our results demonstrate that despite recent advances, current models struggle to accurately segment diverse findings from chest CTs, particularly when dealing with non-focal abnormalities described in natural language reports. While existing models are primarily optimized for fixed categorical labels rather than nuanced clinical descriptions, even fine-tuning these models with free-text descriptions yields limited improvement in segmentation accuracy. These insights highlight that report grounding on 3D medical volumes through segmentation remains an open challenge, necessitating future models that better comprehend complex clinical language and irregular object patterns across volumetric data.