4D-VQ-GAN: A World Model for Synthesizing Medical Scans at Any Time Point for Personalized Disease Progression Modeling of Idiopathic Pulmonary Fibrosis

An Zhao, Moucheng Xu, Ahmed H. Shahin, Wim Wuyts, Mark G. Jones, Joseph Jacob, Daniel C. Alexander
Proceedings of The 8th International Conference on Medical Imaging with Deep Learning, PMLR 301:1879-1909, 2026.

Abstract

Understanding the progression trajectories of diseases is crucial for early diagnosis and effective treatment planning. This is especially vital for life-threatening conditions such as Idiopathic Pulmonary Fibrosis (IPF), a chronic, progressive lung disease with a prognosis comparable to many cancers. Computed tomography (CT) imaging has been established as a reliable diagnostic tool for IPF. Accurately predicting future CT scans of early-stage IPF patients can aid in developing better treatment strategies, thereby improving survival outcomes. Inspired by the recent success of world models in generating video-based virtual physical worlds, we present the first world model for IPF, which synthesizes realistic scans of early-stage IPF patients at any time point. We term our model 4D Vector Quantised Generative Adversarial Networks (4D-VQ-GAN). Our model is trained using a two-stage approach. In the first stage, a 3D-VQ-GAN is trained to reconstruct CT volumes. In the second stage, a Neural Ordinary Differential Equation (ODE) model is trained to capture the temporal dynamics of the quantised embeddings, which are generated by the encoder trained in the first stage. For clinical validation, we conduct survival analysis using imaging biomarkers derived from generated CT scans and achieve a C-index either better than or comparable to that of biomarkers derived from the real CT scans. These results suggest the potential clinical utility of generated longitudinal CT scans, showing that they can reliably predict survival outcomes. The code is publicly available at https://github.com/anzhao920/4DVQGAN.
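The two-stage idea in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see the linked repository for that). It assumes a toy codebook, random vectors standing in for encoder outputs, a small random MLP standing in for the learned dynamics network, and a fixed-step RK4 solver. All names (`quantise`, `make_dynamics`, `integrate`) are hypothetical, and the decoder that would map latents back to a CT volume is omitted.

```python
import numpy as np


def quantise(z, codebook):
    """Map each latent vector in z (N, D) to its nearest codebook entry (K, D)."""
    # Squared Euclidean distance between every latent and every code: shape (N, K)
    d2 = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    idx = d2.argmin(axis=1)
    return codebook[idx], idx


def make_dynamics(dim, hidden=16, seed=0):
    """Hypothetical stand-in for a learned dynamics network f_theta(z, t)."""
    rng = np.random.default_rng(seed)
    w1 = rng.normal(scale=0.1, size=(dim + 1, hidden))
    w2 = rng.normal(scale=0.1, size=(hidden, dim))

    def f(z, t):
        # Append the time as an extra input feature to every latent vector
        t_col = np.full((z.shape[0], 1), t)
        h = np.tanh(np.concatenate([z, t_col], axis=1) @ w1)
        return h @ w2

    return f


def integrate(f, z0, t0, t1, steps=100):
    """Fixed-step RK4 solve of dz/dt = f(z, t) from t0 to an arbitrary t1."""
    z = np.array(z0, dtype=float)
    h = (t1 - t0) / steps
    t = t0
    for _ in range(steps):
        k1 = f(z, t)
        k2 = f(z + 0.5 * h * k1, t + 0.5 * h)
        k3 = f(z + 0.5 * h * k2, t + 0.5 * h)
        k4 = f(z + h * k3, t + h)
        z = z + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
        t += h
    return z


rng = np.random.default_rng(42)
codebook = rng.normal(size=(8, 4))      # toy codebook: 8 codes of dimension 4
z_encoded = rng.normal(size=(5, 4))     # stand-in for stage-one encoder outputs
zq_baseline, codes = quantise(z_encoded, codebook)

f_theta = make_dynamics(dim=4)
# Evolve the quantised baseline latents to a chosen time point (arbitrary units)
z_future = integrate(f_theta, zq_baseline, t0=0.0, t1=1.5)
```

Because the solver takes any end time `t1`, the same baseline latents can be advanced to whichever time point is of interest, which is the property the abstract describes as synthesis "at any time point".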

Cite this Paper


BibTeX
@InProceedings{pmlr-v301-zhao26a,
  title = {4D-VQ-GAN: A World Model for Synthesizing Medical Scans at Any Time Point for Personalized Disease Progression Modeling of Idiopathic Pulmonary Fibrosis},
  author = {Zhao, An and Xu, Moucheng and Shahin, Ahmed H. and Wuyts, Wim and Jones, Mark G. and Jacob, Joseph and Alexander, Daniel C.},
  booktitle = {Proceedings of The 8th International Conference on Medical Imaging with Deep Learning},
  pages = {1879--1909},
  year = {2026},
  editor = {Tasdizen, Tolga and Elhabian, Shireen and Summers, Ronald and Chen, Chen and Koch, Lisa and Zhuang, Yan},
  volume = {301},
  series = {Proceedings of Machine Learning Research},
  month = {09--11 Jul},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v301/main/assets/zhao26a/zhao26a.pdf},
  url = {https://proceedings.mlr.press/v301/zhao26a.html},
  abstract = {Understanding the progression trajectories of diseases is crucial for early diagnosis and effective treatment planning. This is especially vital for life-threatening conditions such as Idiopathic Pulmonary Fibrosis (IPF), a chronic, progressive lung disease with a prognosis comparable to many cancers. Computed tomography (CT) imaging has been established as a reliable diagnostic tool for IPF. Accurately predicting future CT scans of early-stage IPF patients can aid in developing better treatment strategies, thereby improving survival outcomes. As inspired by the recent success of world models in generating video-based virtual physical worlds, we present the first world model for IPF, to synthesize realistic scans of early-stage IPF patients at any time point. We term our model 4D Vector Quantised Generative Adversarial Networks (4D-VQ-GAN). Our model is trained using a two-stage approach. In the first stage, a 3D-VQ-GAN is trained to reconstruct CT volumes. In the second stage, a Neural Ordinary Differential Equation (ODE) model is trained to capture the temporal dynamics of the quantised embeddings, which are generated by the encoder trained in the first stage. For clinical validation, we conduct survival analysis using imaging biomarkers derived from generated CT scans and achieve a C-index either better than or comparable to that of biomarkers derived from the real CT scans. The survival analysis results suggest the potential clinical utility inherent to generated longitudinal CT scans, showing that they can reliably predict survival outcomes. The code is publicly available at https://github.com/anzhao920/4DVQGAN.}
}
Endnote
%0 Conference Paper
%T 4D-VQ-GAN: A World Model for Synthesizing Medical Scans at Any Time Point for Personalized Disease Progression Modeling of Idiopathic Pulmonary Fibrosis
%A An Zhao
%A Moucheng Xu
%A Ahmed H. Shahin
%A Wim Wuyts
%A Mark G. Jones
%A Joseph Jacob
%A Daniel C. Alexander
%B Proceedings of The 8th International Conference on Medical Imaging with Deep Learning
%C Proceedings of Machine Learning Research
%D 2026
%E Tolga Tasdizen
%E Shireen Elhabian
%E Ronald Summers
%E Chen Chen
%E Lisa Koch
%E Yan Zhuang
%F pmlr-v301-zhao26a
%I PMLR
%P 1879--1909
%U https://proceedings.mlr.press/v301/zhao26a.html
%V 301
%X Understanding the progression trajectories of diseases is crucial for early diagnosis and effective treatment planning. This is especially vital for life-threatening conditions such as Idiopathic Pulmonary Fibrosis (IPF), a chronic, progressive lung disease with a prognosis comparable to many cancers. Computed tomography (CT) imaging has been established as a reliable diagnostic tool for IPF. Accurately predicting future CT scans of early-stage IPF patients can aid in developing better treatment strategies, thereby improving survival outcomes. As inspired by the recent success of world models in generating video-based virtual physical worlds, we present the first world model for IPF, to synthesize realistic scans of early-stage IPF patients at any time point. We term our model 4D Vector Quantised Generative Adversarial Networks (4D-VQ-GAN). Our model is trained using a two-stage approach. In the first stage, a 3D-VQ-GAN is trained to reconstruct CT volumes. In the second stage, a Neural Ordinary Differential Equation (ODE) model is trained to capture the temporal dynamics of the quantised embeddings, which are generated by the encoder trained in the first stage. For clinical validation, we conduct survival analysis using imaging biomarkers derived from generated CT scans and achieve a C-index either better than or comparable to that of biomarkers derived from the real CT scans. The survival analysis results suggest the potential clinical utility inherent to generated longitudinal CT scans, showing that they can reliably predict survival outcomes. The code is publicly available at https://github.com/anzhao920/4DVQGAN.
APA
Zhao, A., Xu, M., Shahin, A.H., Wuyts, W., Jones, M.G., Jacob, J. & Alexander, D.C. (2026). 4D-VQ-GAN: A World Model for Synthesizing Medical Scans at Any Time Point for Personalized Disease Progression Modeling of Idiopathic Pulmonary Fibrosis. Proceedings of The 8th International Conference on Medical Imaging with Deep Learning, in Proceedings of Machine Learning Research 301:1879-1909. Available from https://proceedings.mlr.press/v301/zhao26a.html.
