Scaling Supervision for Free: Leveraging Universal Segmentation Models for Enhanced Medical Image Diagnosis

Yingtai Li, Shuai Ming, Haoran Lai, Fenghe Tang, Wei Wei, S Kevin Zhou
Proceedings of The 8th International Conference on Medical Imaging with Deep Learning, PMLR 301:1021-1040, 2026.

Abstract

Deep learning-based medical image analysis has been constrained by the limited availability of large-scale annotated data. While recent advances in large language models have enabled scaling automatic extraction of diagnostic labels from reports, we propose that scaling other forms of supervision could be an equally important yet unexplored direction. Inspired by the success of foundation models, we leverage modern universal segmentation models to scale anatomical segmentation as an additional supervision signal during training. Through extensive experiments on three large-scale CT datasets totaling 58K+ volumes, we demonstrate that incorporating this free anatomical supervision consistently improves the performance of various mainstream architectures (ResNet, ViT, and Swin Transformer) by up to 12.74%, with particularly significant gains for Transformer-based models and anatomically localized abnormalities, while maintaining inference efficiency, as the segmentation branch is used only during training. This work opens up a new direction for scaling in medical imaging and demonstrates how existing universal segmentation models can be repurposed to enhance diagnostic models at virtually no additional cost.

Cite this Paper


BibTeX
@InProceedings{pmlr-v301-li26d,
  title = {Scaling Supervision for Free: Leveraging Universal Segmentation Models for Enhanced Medical Image Diagnosis},
  author = {Li, Yingtai and Ming, Shuai and Lai, Haoran and Tang, Fenghe and Wei, Wei and Zhou, S Kevin},
  booktitle = {Proceedings of The 8th International Conference on Medical Imaging with Deep Learning},
  pages = {1021--1040},
  year = {2026},
  editor = {Tasdizen, Tolga and Elhabian, Shireen and Summers, Ronald and Chen, Chen and Koch, Lisa and Zhuang, Yan},
  volume = {301},
  series = {Proceedings of Machine Learning Research},
  month = {09--11 Jul},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v301/main/assets/li26d/li26d.pdf},
  url = {https://proceedings.mlr.press/v301/li26d.html},
  abstract = {Deep learning-based medical image analysis has been constrained by the limited availability of large-scale annotated data. While recent advances in large language models have enabled scaling automatic extraction of diagnostic labels from reports, we propose that scaling other forms of supervision could be an equally important yet unexplored direction. Inspired by the success of foundation models, we leverage modern universal segmentation models to scale anatomical segmentation as an additional supervision signal during training. Through extensive experiments on three large-scale CT datasets totaling 58K+ volumes, we demonstrate that incorporating this free anatomical supervision consistently improves the performance of various mainstream architectures (ResNet, ViT, and Swin Transformer) by up to 12.74\%, with particularly significant gains for Transformer-based models and anatomically localized abnormalities, while maintaining inference efficiency, as the segmentation branch is used only during training. This work opens up a new direction for scaling in medical imaging and demonstrates how existing universal segmentation models can be repurposed to enhance diagnostic models at virtually no additional cost.}
}
Endnote
%0 Conference Paper
%T Scaling Supervision for Free: Leveraging Universal Segmentation Models for Enhanced Medical Image Diagnosis
%A Yingtai Li
%A Shuai Ming
%A Haoran Lai
%A Fenghe Tang
%A Wei Wei
%A S Kevin Zhou
%B Proceedings of The 8th International Conference on Medical Imaging with Deep Learning
%C Proceedings of Machine Learning Research
%D 2026
%E Tolga Tasdizen
%E Shireen Elhabian
%E Ronald Summers
%E Chen Chen
%E Lisa Koch
%E Yan Zhuang
%F pmlr-v301-li26d
%I PMLR
%P 1021--1040
%U https://proceedings.mlr.press/v301/li26d.html
%V 301
%X Deep learning-based medical image analysis has been constrained by the limited availability of large-scale annotated data. While recent advances in large language models have enabled scaling automatic extraction of diagnostic labels from reports, we propose that scaling other forms of supervision could be an equally important yet unexplored direction. Inspired by the success of foundation models, we leverage modern universal segmentation models to scale anatomical segmentation as an additional supervision signal during training. Through extensive experiments on three large-scale CT datasets totaling 58K+ volumes, we demonstrate that incorporating this free anatomical supervision consistently improves the performance of various mainstream architectures (ResNet, ViT, and Swin Transformer) by up to 12.74%, with particularly significant gains for Transformer-based models and anatomically localized abnormalities, while maintaining inference efficiency, as the segmentation branch is used only during training. This work opens up a new direction for scaling in medical imaging and demonstrates how existing universal segmentation models can be repurposed to enhance diagnostic models at virtually no additional cost.
APA
Li, Y., Ming, S., Lai, H., Tang, F., Wei, W. & Zhou, S. K. (2026). Scaling Supervision for Free: Leveraging Universal Segmentation Models for Enhanced Medical Image Diagnosis. Proceedings of The 8th International Conference on Medical Imaging with Deep Learning, in Proceedings of Machine Learning Research 301:1021-1040. Available from https://proceedings.mlr.press/v301/li26d.html.
