Vision-Language Generative Model for View-Specific Chest X-ray Generation

Hyungyung Lee, Da Young Lee, Wonjae Kim, Jin-Hwa Kim, Tackeun Kim, Jihang Kim, Leonard Sunwoo, Edward Choi
Proceedings of the fifth Conference on Health, Inference, and Learning, PMLR 248:280-296, 2024.

Abstract

Synthetic medical data generation has opened up new possibilities in the healthcare domain, offering a powerful tool for simulating clinical scenarios, enhancing diagnostic and treatment quality, gaining granular medical knowledge, and accelerating the development of unbiased algorithms. In this context, we present a novel approach called ViewXGen, designed to overcome the limitations of existing methods that rely on general-domain pipelines using only radiology reports to generate frontal-view chest X-rays. Our approach takes into consideration the diverse view positions found in the dataset, enabling the generation of chest X-rays with specific views, which marks a significant advancement in the field. To achieve this, we introduce a set of specially designed tokens for each view position, tailoring the generation process to the user’s preferences. Furthermore, we leverage multi-view chest X-rays as input, incorporating valuable information from different views within the same study. This integration rectifies potential errors and contributes to faithfully capturing abnormal findings in chest X-ray generation. To validate the effectiveness of our approach, we conducted statistical analyses, evaluating its performance on a clinical efficacy metric using the MIMIC-CXR dataset. In addition, a human evaluation demonstrates the capability of ViewXGen to produce realistic view-specific X-rays that closely resemble the original images.
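To make the view-token mechanism concrete, the following is a minimal, hypothetical PyTorch sketch of how a generator might condition on a target-view token, a tokenized radiology report, and discrete image tokens from another view in the same study. All names (VIEW_TOKENS, ViewConditionedGenerator) and vocabulary sizes are illustrative assumptions, not the authors' implementation; the paper's actual tokenizer, architecture, and decoding procedure may differ.

import torch
import torch.nn as nn

# Hypothetical special tokens marking which view to generate, one id per view position.
VIEW_TOKENS = {"PA": 0, "AP": 1, "LATERAL": 2}

class ViewConditionedGenerator(nn.Module):
    def __init__(self, text_vocab=30522, image_vocab=1024, d_model=256,
                 n_views=len(VIEW_TOKENS)):
        super().__init__()
        self.text_emb = nn.Embedding(text_vocab, d_model)
        self.image_emb = nn.Embedding(image_vocab, d_model)
        self.view_emb = nn.Embedding(n_views, d_model)  # view-specific tokens
        layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, num_layers=4)
        self.to_image_logits = nn.Linear(d_model, image_vocab)

    def forward(self, report_ids, other_view_ids, target_view):
        # report_ids:     (B, L_txt) tokenized radiology report
        # other_view_ids: (B, L_img) discrete tokens of another view in the study
        # target_view:    (B,)       id of the view to synthesize
        view = self.view_emb(target_view).unsqueeze(1)  # (B, 1, D)
        seq = torch.cat([view,
                         self.text_emb(report_ids),
                         self.image_emb(other_view_ids)], dim=1)
        h = self.backbone(seq)
        # Predict a token distribution over image-codebook entries; a real
        # generator would decode the target view autoregressively instead.
        return self.to_image_logits(h)

# Toy usage: request a lateral view given a report and an existing PA view.
model = ViewConditionedGenerator()
report = torch.randint(0, 30522, (1, 16))
pa_tokens = torch.randint(0, 1024, (1, 32))
target = torch.tensor([VIEW_TOKENS["LATERAL"]])
logits = model(report, pa_tokens, target)
print(logits.shape)  # torch.Size([1, 49, 1024])

Prepending the view embedding to the input sequence lets self-attention route view-appropriate features from both the report and the companion view, which is one plausible way to realize the per-view tokens and multi-view conditioning the abstract describes.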

Cite this Paper


BibTeX
@InProceedings{pmlr-v248-lee24a,
  title     = {Vision-Language Generative Model for View-Specific Chest X-ray Generation},
  author    = {Lee, Hyungyung and Lee, Da Young and Kim, Wonjae and Kim, Jin-Hwa and Kim, Tackeun and Kim, Jihang and Sunwoo, Leonard and Choi, Edward},
  booktitle = {Proceedings of the fifth Conference on Health, Inference, and Learning},
  pages     = {280--296},
  year      = {2024},
  editor    = {Pollard, Tom and Choi, Edward and Singhal, Pankhuri and Hughes, Michael and Sizikova, Elena and Mortazavi, Bobak and Chen, Irene and Wang, Fei and Sarker, Tasmie and McDermott, Matthew and Ghassemi, Marzyeh},
  volume    = {248},
  series    = {Proceedings of Machine Learning Research},
  month     = {27--28 Jun},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v248/main/assets/lee24a/lee24a.pdf},
  url       = {https://proceedings.mlr.press/v248/lee24a.html},
  abstract  = {Synthetic medical data generation has opened up new possibilities in the healthcare domain, offering a powerful tool for simulating clinical scenarios, enhancing diagnostic and treatment quality, gaining granular medical knowledge, and accelerating the development of unbiased algorithms. In this context, we present a novel approach called ViewXGen, designed to overcome the limitations of existing methods that rely on general domain pipelines using only radiology reports to generate frontal-view chest X-rays. Our approach takes into consideration the diverse view positions found in the dataset, enabling the generation of chest X-rays with specific views, which marks a significant advancement in the field. To achieve this, we introduce a set of specially designed tokens for each view position, tailoring the generation process to the user’s preferences. Furthermore, we leverage multi-view chest X-rays as input, incorporating valuable information from different views within the same study. This integration rectifies potential errors and contributes to faithfully capturing abnormal findings in chest X-ray generation. To validate the effectiveness of our approach, we conducted statistical analyses, evaluating its performance in a clinical efficacy metric on the MIMIC-CXR dataset. Also, human evaluation demonstrates the remarkable capabilities of ViewXGen, particularly in producing realistic view-specific X-rays that closely resemble the original images.}
}
Endnote
%0 Conference Paper
%T Vision-Language Generative Model for View-Specific Chest X-ray Generation
%A Hyungyung Lee
%A Da Young Lee
%A Wonjae Kim
%A Jin-Hwa Kim
%A Tackeun Kim
%A Jihang Kim
%A Leonard Sunwoo
%A Edward Choi
%B Proceedings of the fifth Conference on Health, Inference, and Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Tom Pollard
%E Edward Choi
%E Pankhuri Singhal
%E Michael Hughes
%E Elena Sizikova
%E Bobak Mortazavi
%E Irene Chen
%E Fei Wang
%E Tasmie Sarker
%E Matthew McDermott
%E Marzyeh Ghassemi
%F pmlr-v248-lee24a
%I PMLR
%P 280--296
%U https://proceedings.mlr.press/v248/lee24a.html
%V 248
%X Synthetic medical data generation has opened up new possibilities in the healthcare domain, offering a powerful tool for simulating clinical scenarios, enhancing diagnostic and treatment quality, gaining granular medical knowledge, and accelerating the development of unbiased algorithms. In this context, we present a novel approach called ViewXGen, designed to overcome the limitations of existing methods that rely on general domain pipelines using only radiology reports to generate frontal-view chest X-rays. Our approach takes into consideration the diverse view positions found in the dataset, enabling the generation of chest X-rays with specific views, which marks a significant advancement in the field. To achieve this, we introduce a set of specially designed tokens for each view position, tailoring the generation process to the user’s preferences. Furthermore, we leverage multi-view chest X-rays as input, incorporating valuable information from different views within the same study. This integration rectifies potential errors and contributes to faithfully capturing abnormal findings in chest X-ray generation. To validate the effectiveness of our approach, we conducted statistical analyses, evaluating its performance in a clinical efficacy metric on the MIMIC-CXR dataset. Also, human evaluation demonstrates the remarkable capabilities of ViewXGen, particularly in producing realistic view-specific X-rays that closely resemble the original images.
APA
Lee, H., Lee, D.Y., Kim, W., Kim, J.-H., Kim, T., Kim, J., Sunwoo, L. & Choi, E. (2024). Vision-Language Generative Model for View-Specific Chest X-ray Generation. Proceedings of the fifth Conference on Health, Inference, and Learning, in Proceedings of Machine Learning Research 248:280-296. Available from https://proceedings.mlr.press/v248/lee24a.html.
