Resolution and Field of View Invariant Generative Modelling with Latent Diffusion Models

Ashay Patel, Mark S Graham, Vicky Goh, Sebastien Ourselin, M. Jorge Cardoso
Proceedings of The 7th International Conference on Medical Imaging with Deep Learning, PMLR 250:1086-1097, 2024.

Abstract

Large dataset requirements for deep learning methods can pose a challenge in the medical field, where datasets tend to be relatively small. Synthetic data can provide a suitable solution to this problem when complemented with real data. However, current generative methods normally require all data to be of the same resolution and, ideally, aligned to an atlas. This not only places more stringent restrictions on the training data but also limits what data can be used for inference. To overcome this, our work proposes a latent diffusion model that is able to control sample geometries by varying their resolution, field of view, and orientation. We demonstrate this work on whole-body CT data, using a spatial conditioning mechanism. We showcase how our model provides samples as good as those of an ordinary latent diffusion model trained fully on whole-body, single-resolution data. This is in addition to the benefit of further control over resolution, field of view, orientation, and even the emergent behaviour of super-resolution. We found that our model could create realistic images across the varying tasks, showcasing the potential of this application.

Cite this Paper


BibTeX
@InProceedings{pmlr-v250-patel24a,
  title = {Resolution and Field of View Invariant Generative Modelling with Latent Diffusion Models},
  author = {Patel, Ashay and Graham, Mark S and Goh, Vicky and Ourselin, Sebastien and Cardoso, M. Jorge},
  booktitle = {Proceedings of The 7th International Conference on Medical Imaging with Deep Learning},
  pages = {1086--1097},
  year = {2024},
  editor = {Burgos, Ninon and Petitjean, Caroline and Vakalopoulou, Maria and Christodoulidis, Stergios and Coupe, Pierrick and Delingette, Hervé and Lartizien, Carole and Mateus, Diana},
  volume = {250},
  series = {Proceedings of Machine Learning Research},
  month = {03--05 Jul},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v250/main/assets/patel24a/patel24a.pdf},
  url = {https://proceedings.mlr.press/v250/patel24a.html},
  abstract = {Large dataset requirements for deep learning methods can pose a challenge in the medical field, where datasets tend to be relatively small. Synthetic data can provide a suitable solution to this problem, when complemented with real data. However current generative methods normally require all data to be of the same resolution and, ideally, aligned to an atlas. This not only creates more stringent restrictions on the training data but also limits what data can be used for inference. To overcome this our work proposes a latent diffusion model that is able to control sample geometries by varying their resolution, field of view, and orientation. We demonstrate this work on whole body CT data, using a spatial conditioning mechanism. We showcase how our model provides samples as good as an ordinary latent diffusion model trained fully on whole body single resolution data. This is in addition to the benefit of further control over resolution, field of view, orientation, and even the emergent behaviour of super-resolution. We found that our model could create realistic images across the varying tasks showcasing the potential of this application.}
}
Endnote
%0 Conference Paper
%T Resolution and Field of View Invariant Generative Modelling with Latent Diffusion Models
%A Ashay Patel
%A Mark S Graham
%A Vicky Goh
%A Sebastien Ourselin
%A M. Jorge Cardoso
%B Proceedings of The 7th International Conference on Medical Imaging with Deep Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Ninon Burgos
%E Caroline Petitjean
%E Maria Vakalopoulou
%E Stergios Christodoulidis
%E Pierrick Coupe
%E Hervé Delingette
%E Carole Lartizien
%E Diana Mateus
%F pmlr-v250-patel24a
%I PMLR
%P 1086--1097
%U https://proceedings.mlr.press/v250/patel24a.html
%V 250
%X Large dataset requirements for deep learning methods can pose a challenge in the medical field, where datasets tend to be relatively small. Synthetic data can provide a suitable solution to this problem, when complemented with real data. However current generative methods normally require all data to be of the same resolution and, ideally, aligned to an atlas. This not only creates more stringent restrictions on the training data but also limits what data can be used for inference. To overcome this our work proposes a latent diffusion model that is able to control sample geometries by varying their resolution, field of view, and orientation. We demonstrate this work on whole body CT data, using a spatial conditioning mechanism. We showcase how our model provides samples as good as an ordinary latent diffusion model trained fully on whole body single resolution data. This is in addition to the benefit of further control over resolution, field of view, orientation, and even the emergent behaviour of super-resolution. We found that our model could create realistic images across the varying tasks showcasing the potential of this application.
APA
Patel, A., Graham, M.S., Goh, V., Ourselin, S. & Cardoso, M.J. (2024). Resolution and Field of View Invariant Generative Modelling with Latent Diffusion Models. Proceedings of The 7th International Conference on Medical Imaging with Deep Learning, in Proceedings of Machine Learning Research 250:1086-1097. Available from https://proceedings.mlr.press/v250/patel24a.html.
