State-of-the-Art Text-Prompted Medical Segmentation Models Struggle to Ground Chest CT Findings

Mohammed Baharoon, Luyang Luo, Michael Moritz, Abhinav Kumar, Sung Eun Kim, Xiaoman Zhang, Miao Zhu, Kent Kleinschmidt, Sri Sai Dinesh Jaliparthi, Sathvik Suryadevara, Rithvik Akula, Mark Marino, Wenhui Lei, Ibrahim Ethem Hamamci, Pranav Rajpurkar
Proceedings of the 10th Machine Learning for Healthcare Conference, PMLR 298, 2025.

Abstract

This study presents a comprehensive evaluation of state-of-the-art text-prompted segmentation models, including SAM2, MedSAM2, SegVol, SAT, and BiomedParse, on ReXGrounding, a novel dataset that pairs chest CT findings with corresponding segmentation masks. Our results demonstrate that despite recent advances, current models struggle to accurately segment diverse findings from chest CTs, particularly when dealing with non-focal abnormalities described in natural language reports. While existing models are primarily optimized for fixed categorical labels rather than nuanced clinical descriptions, even fine-tuning these models with free-text descriptions yields limited improvement in segmentation accuracy. These insights highlight that report grounding on 3D medical volumes through segmentation remains an open challenge, necessitating future models that better comprehend complex clinical language and irregular object patterns across volumetric data.

Cite this Paper


BibTeX
@InProceedings{pmlr-v298-baharoon25a,
  title = {State-of-the-Art Text-Prompted Medical Segmentation Models Struggle to Ground Chest {CT} Findings},
  author = {Baharoon, Mohammed and Luo, Luyang and Moritz, Michael and Kumar, Abhinav and Kim, Sung Eun and Zhang, Xiaoman and Zhu, Miao and Kleinschmidt, Kent and Jaliparthi, Sri Sai Dinesh and Suryadevara, Sathvik and Akula, Rithvik and Marino, Mark and Lei, Wenhui and Hamamci, Ibrahim Ethem and Rajpurkar, Pranav},
  booktitle = {Proceedings of the 10th Machine Learning for Healthcare Conference},
  year = {2025},
  editor = {Agrawal, Monica and Deshpande, Kaivalya and Engelhard, Matthew and Joshi, Shalmali and Tang, Shengpu and Urteaga, Iñigo},
  volume = {298},
  series = {Proceedings of Machine Learning Research},
  month = {15--16 Aug},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v298/main/assets/baharoon25a/baharoon25a.pdf},
  url = {https://proceedings.mlr.press/v298/baharoon25a.html},
  abstract = {This study presents a comprehensive evaluation of state-of-the-art text-prompted segmentation models, including SAM2, MedSAM2, SegVol, SAT, and BiomedParse, on ReXGrounding, a novel dataset that pairs chest CT findings with corresponding segmentation masks. Our results demonstrate that despite recent advances, current models struggle to accurately segment diverse findings from chest CTs, particularly when dealing with non-focal abnormalities described in natural language reports. While existing models are primarily optimized for fixed categorical labels rather than nuanced clinical descriptions, even fine-tuning these models with free-text descriptions yields limited improvement in segmentation accuracy. These insights highlight that report grounding on 3D medical volumes through segmentation remains an open challenge, necessitating future models that better comprehend complex clinical language and irregular object patterns across volumetric data.}
}
Endnote
%0 Conference Paper
%T State-of-the-Art Text-Prompted Medical Segmentation Models Struggle to Ground Chest CT Findings
%A Mohammed Baharoon
%A Luyang Luo
%A Michael Moritz
%A Abhinav Kumar
%A Sung Eun Kim
%A Xiaoman Zhang
%A Miao Zhu
%A Kent Kleinschmidt
%A Sri Sai Dinesh Jaliparthi
%A Sathvik Suryadevara
%A Rithvik Akula
%A Mark Marino
%A Wenhui Lei
%A Ibrahim Ethem Hamamci
%A Pranav Rajpurkar
%B Proceedings of the 10th Machine Learning for Healthcare Conference
%C Proceedings of Machine Learning Research
%D 2025
%E Monica Agrawal
%E Kaivalya Deshpande
%E Matthew Engelhard
%E Shalmali Joshi
%E Shengpu Tang
%E Iñigo Urteaga
%F pmlr-v298-baharoon25a
%I PMLR
%U https://proceedings.mlr.press/v298/baharoon25a.html
%V 298
%X This study presents a comprehensive evaluation of state-of-the-art text-prompted segmentation models, including SAM2, MedSAM2, SegVol, SAT, and BiomedParse, on ReXGrounding, a novel dataset that pairs chest CT findings with corresponding segmentation masks. Our results demonstrate that despite recent advances, current models struggle to accurately segment diverse findings from chest CTs, particularly when dealing with non-focal abnormalities described in natural language reports. While existing models are primarily optimized for fixed categorical labels rather than nuanced clinical descriptions, even fine-tuning these models with free-text descriptions yields limited improvement in segmentation accuracy. These insights highlight that report grounding on 3D medical volumes through segmentation remains an open challenge, necessitating future models that better comprehend complex clinical language and irregular object patterns across volumetric data.
APA
Baharoon, M., Luo, L., Moritz, M., Kumar, A., Kim, S.E., Zhang, X., Zhu, M., Kleinschmidt, K., Jaliparthi, S.S.D., Suryadevara, S., Akula, R., Marino, M., Lei, W., Hamamci, I.E. & Rajpurkar, P. (2025). State-of-the-Art Text-Prompted Medical Segmentation Models Struggle to Ground Chest CT Findings. Proceedings of the 10th Machine Learning for Healthcare Conference, in Proceedings of Machine Learning Research 298. Available from https://proceedings.mlr.press/v298/baharoon25a.html.
