Guideline-Informed MLLM Reasoning for Pathology-Aware Postoperative Prostate CTV Segmentation
Proceedings of The 9th International Conference on Medical Imaging with Deep Learning, PMLR 315:1004-1028, 2026.
Abstract
Accurate segmentation of the Clinical Target Volume (CTV) is a critical prerequisite for precise radiotherapy planning, which aims to irradiate microscopic disease completely while minimizing toxicity to surrounding healthy organs. However, automated CTV segmentation remains highly challenging because microscopic disease is invisible on planning CT and delineation decisions must incorporate clinical context. Unlike previous methods that rely solely on visual features or coarse global text reasoning, we propose ReaCT, a unified framework that reformulates CTV segmentation as a multimodal reasoning task by explicitly integrating pathological information with visual context. Specifically, we introduce a Guideline-Informed Attribute Extractor that follows the information-retrieval workflow of radiation oncologists. By distilling knowledge from clinical guidelines, this module filters and structures lengthy pathology reports into a concise set of clinically determinative pathological attributes, bridging the semantic gap between unstructured clinical records and segmentation networks. Furthermore, we develop an Attribute-Specific MLLM Reasoner, built upon a 3D residual U-Net, that performs fine-grained spatial reasoning. Using a sequence of attribute-specific query tokens, the model disentangles the distinct target implications of individual pathological attributes and aligns them with the anatomy through multi-scale fusion with Two-Way Transformers. Experiments on a postoperative prostate cancer dataset demonstrate that ReaCT achieves state-of-the-art segmentation performance and exhibits strong robustness, with pronounced improvements under limited-annotation settings.
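To make the fusion idea concrete, the following is a minimal sketch of one two-way attention step between attribute query tokens and flattened 3D feature maps. It is not the authors' implementation: the function names (`cross_attend`, `two_way_block`), the tensor sizes, the single attention head, and the absence of learned projections, normalization, and MLP layers are all simplifying assumptions made for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention_weights(queries, keys):
    # Scaled dot-product attention weights, shape (num_queries, num_keys).
    d = queries.shape[-1]
    return softmax(queries @ keys.T / np.sqrt(d), axis=-1)

def cross_attend(queries, keys_values):
    # Single-head attention with no learned projections (assumption).
    return attention_weights(queries, keys_values) @ keys_values

def two_way_block(tokens, feats):
    # Direction 1: each attribute token gathers evidence from the image features.
    tokens = tokens + cross_attend(tokens, feats)
    # Direction 2: each voxel feature is updated from the refreshed tokens.
    feats = feats + cross_attend(feats, tokens)
    return tokens, feats

# Hypothetical sizes: 4 pathological attributes, 16 channels, a 2x4x4 feature map.
rng = np.random.default_rng(0)
num_attributes, dim = 4, 16
depth, height, width = 2, 4, 4

tokens = rng.normal(size=(num_attributes, dim))
volume_feats = rng.normal(size=(depth, height, width, dim))

# Flatten the spatial grid so every voxel becomes one sequence element.
flat_feats = volume_feats.reshape(-1, dim)
new_tokens, new_flat = two_way_block(tokens, flat_feats)

# Restore the spatial layout for a downstream decoder stage.
fused = new_flat.reshape(depth, height, width, dim)
```

In a multi-scale setting, a block of this kind would presumably be applied at several decoder resolutions, with the same attribute tokens attending to feature maps of different spatial sizes.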