Virtual-Eyes: Quantitative Validation of a Lung CT Quality-Control Pipeline for Foundation-Model Cancer Risk Prediction

Md. Enamul Hoq, Linda Larson-Prior, Fred Prior
Proceedings of The 9th International Conference on Medical Imaging with Deep Learning, PMLR 315:4639-4663, 2026.

Abstract

Robust preprocessing is rarely quantified in deep-learning pipelines for low-dose CT (LDCT) lung cancer screening. We develop and validate Virtual-Eyes, a clinically motivated, 16-bit CT quality-control pipeline for NLST, and measure its differential impact on generalist foundation models versus specialist models. Virtual-Eyes enforces strict 512 $\times$ 512 resolution, rejects short or non-diagnostic series, and extracts a contiguous lung block using Hounsfield-unit filtering and bilateral lung-coverage scoring while preserving the original 16-bit DICOM grid. Using 765 NLST patients (182 cancer, 583 non-cancer), we evaluate RAD-DINO, Merlin, Sybil, and ResNet-18 under a leakage-free protocol. For RAD-DINO, preprocessing improves slice-level AUC from 0.576 to 0.610 and patient-level AUC from 0.646 to 0.683 (mean pooling) and 0.619 to 0.735 (max pooling), with improved calibration (Brier score 0.188 $\rightarrow$ 0.112). In contrast, Sybil and ResNet-18 degrade under Virtual-Eyes, revealing reliance on contextual or shortcut features, while Merlin shows limited transferability. Sensitivity analysis and uncertainty estimation confirm the robustness and stability of these findings.
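The abstract reports slice-level scores aggregated to patient level via mean and max pooling, with calibration measured by Brier score. A minimal sketch of those two aggregation strategies and the Brier computation, assuming per-slice cancer probabilities as input (function names are illustrative, not from the paper's code):

```python
import numpy as np

def pool_patient_score(slice_probs, method="mean"):
    """Aggregate per-slice cancer probabilities into one patient-level score.

    slice_probs: 1-D array of per-slice probabilities for a single patient.
    method: "mean" or "max" pooling, the two strategies compared in the paper.
    """
    slice_probs = np.asarray(slice_probs, dtype=float)
    if method == "mean":
        return float(slice_probs.mean())
    if method == "max":
        return float(slice_probs.max())
    raise ValueError(f"unknown pooling method: {method}")

def brier_score(probs, labels):
    """Mean squared error between predicted probabilities and binary labels."""
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels, dtype=float)
    return float(np.mean((probs - labels) ** 2))

# Example: one patient with three slice-level probabilities.
p_mean = pool_patient_score([0.2, 0.4, 0.9], method="mean")  # 0.5
p_max = pool_patient_score([0.2, 0.4, 0.9], method="max")    # 0.9
```

Max pooling rewards a single highly suspicious slice, while mean pooling smooths over the series; the paper finds the former benefits most from Virtual-Eyes preprocessing (patient-level AUC 0.619 → 0.735).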

Cite this Paper


BibTeX
@InProceedings{pmlr-v315-hoq26a,
  title     = {Virtual-Eyes: Quantitative Validation of a Lung CT Quality-Control Pipeline for Foundation-Model Cancer Risk Prediction},
  author    = {Hoq, Md. Enamul and Larson-Prior, Linda and Prior, Fred},
  booktitle = {Proceedings of The 9th International Conference on Medical Imaging with Deep Learning},
  pages     = {4639--4663},
  year      = {2026},
  editor    = {Huo, Yuankai and Gao, Mingchen and Kuo, Chang-Fu and Jin, Yueming and Deng, Ruining},
  volume    = {315},
  series    = {Proceedings of Machine Learning Research},
  month     = {08--10 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v315/main/assets/hoq26a/hoq26a.pdf},
  url       = {https://proceedings.mlr.press/v315/hoq26a.html},
  abstract  = {Robust preprocessing is rarely quantified in deep-learning pipelines for low-dose CT (LDCT) lung cancer screening. We develop and validate Virtual-Eyes, a clinically motivated, 16-bit CT quality-control pipeline for NLST, and measure its differential impact on generalist foundation models versus specialist models. Virtual-Eyes enforces strict 512 $\times$ 512 resolution, rejects short or non-diagnostic series, and extracts a contiguous lung block using Hounsfield-unit filtering and bilateral lung-coverage scoring while preserving the original 16-bit DICOM grid. Using 765 NLST patients (182 cancer, 583 non-cancer), we evaluate RAD-DINO, Merlin, Sybil, and ResNet-18 under a leakage-free protocol. For RAD-DINO, preprocessing improves slice-level AUC from 0.576 to 0.610 and patient-level AUC from 0.646 to 0.683 (mean pooling) and 0.619 to 0.735 (max pooling), with improved calibration (Brier score 0.188 $\rightarrow$ 0.112). In contrast, Sybil and ResNet-18 degrade under Virtual-Eyes, revealing reliance on contextual or shortcut features, while Merlin shows limited transferability. Sensitivity analysis and uncertainty estimation confirm the robustness and stability of these findings.}
}
Endnote
%0 Conference Paper
%T Virtual-Eyes: Quantitative Validation of a Lung CT Quality-Control Pipeline for Foundation-Model Cancer Risk Prediction
%A Md. Enamul Hoq
%A Linda Larson-Prior
%A Fred Prior
%B Proceedings of The 9th International Conference on Medical Imaging with Deep Learning
%C Proceedings of Machine Learning Research
%D 2026
%E Yuankai Huo
%E Mingchen Gao
%E Chang-Fu Kuo
%E Yueming Jin
%E Ruining Deng
%F pmlr-v315-hoq26a
%I PMLR
%P 4639--4663
%U https://proceedings.mlr.press/v315/hoq26a.html
%V 315
%X Robust preprocessing is rarely quantified in deep-learning pipelines for low-dose CT (LDCT) lung cancer screening. We develop and validate Virtual-Eyes, a clinically motivated, 16-bit CT quality-control pipeline for NLST, and measure its differential impact on generalist foundation models versus specialist models. Virtual-Eyes enforces strict 512 $\times$ 512 resolution, rejects short or non-diagnostic series, and extracts a contiguous lung block using Hounsfield-unit filtering and bilateral lung-coverage scoring while preserving the original 16-bit DICOM grid. Using 765 NLST patients (182 cancer, 583 non-cancer), we evaluate RAD-DINO, Merlin, Sybil, and ResNet-18 under a leakage-free protocol. For RAD-DINO, preprocessing improves slice-level AUC from 0.576 to 0.610 and patient-level AUC from 0.646 to 0.683 (mean pooling) and 0.619 to 0.735 (max pooling), with improved calibration (Brier score 0.188 $\rightarrow$ 0.112). In contrast, Sybil and ResNet-18 degrade under Virtual-Eyes, revealing reliance on contextual or shortcut features, while Merlin shows limited transferability. Sensitivity analysis and uncertainty estimation confirm the robustness and stability of these findings.
APA
Hoq, M.E., Larson-Prior, L. & Prior, F. (2026). Virtual-Eyes: Quantitative Validation of a Lung CT Quality-Control Pipeline for Foundation-Model Cancer Risk Prediction. Proceedings of The 9th International Conference on Medical Imaging with Deep Learning, in Proceedings of Machine Learning Research 315:4639-4663. Available from https://proceedings.mlr.press/v315/hoq26a.html.