Diffusion Model-Based Data Augmentation for Lung Ultrasound Classification with Limited Data

Xiaohui Zhang, Ahana Gangopadhyay, Hsi-Ming Chang, Ravi Soni
Proceedings of the 3rd Machine Learning for Health Symposium, PMLR 225:664-676, 2023.

Abstract

Deep learning models typically require large quantities of data for good generalization. However, acquiring labeled medical imaging data is expensive, particularly for rare pathologies. While standard data augmentation is routinely performed to improve data variety, it may not be sufficient to improve the performance of downstream tasks with a clinical diagnostic purpose. Here we investigate the applicability of SinDDM (Kulikov et al., 2023), a single-image denoising diffusion model, for medical image data augmentation with lung ultrasound (LUS) images. Qualitative and quantitative evaluations of the perceptual quality of the generated images were conducted. A multi-class classification task to detect various pathologies from LUS images was also employed to demonstrate the effectiveness of synthetic data augmentation using SinDDM. We further evaluated the image generation performance of FewDDM, an extended version of SinDDM trained on a limited number of images instead of a single image. Our results show that both SinDDM and FewDDM are able to generate images superior in quality to those of single-image generative adversarial networks (GANs), and are also highly effective in augmenting medical imaging data with a limited number of samples to improve downstream task performance.

Cite this Paper


BibTeX
@InProceedings{pmlr-v225-zhang23a,
  title     = {Diffusion Model-Based Data Augmentation for Lung Ultrasound Classification with Limited Data},
  author    = {Zhang, Xiaohui and Gangopadhyay, Ahana and Chang, Hsi-Ming and Soni, Ravi},
  booktitle = {Proceedings of the 3rd Machine Learning for Health Symposium},
  pages     = {664--676},
  year      = {2023},
  editor    = {Hegselmann, Stefan and Parziale, Antonio and Shanmugam, Divya and Tang, Shengpu and Asiedu, Mercy Nyamewaa and Chang, Serina and Hartvigsen, Tom and Singh, Harvineet},
  volume    = {225},
  series    = {Proceedings of Machine Learning Research},
  month     = {10 Dec},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v225/zhang23a/zhang23a.pdf},
  url       = {https://proceedings.mlr.press/v225/zhang23a.html},
  abstract  = {Deep learning models typically require large quantities of data for good generalization. However, acquiring labeled medical imaging data is expensive, particularly for rare pathologies. While standard data augmentation is routinely performed to improve data variety, it may not be sufficient to improve the performance of downstream tasks with a clinical diagnostic purpose. Here we investigate the applicability of SinDDM (Kulikov et al., 2023), a single-image denoising diffusion model, for medical image data augmentation with lung ultrasound (LUS) images. Qualitative and quantitative evaluations of the perceptual quality of the generated images were conducted. A multi-class classification task to detect various pathologies from LUS images was also employed to demonstrate the effectiveness of synthetic data augmentation using SinDDM. We further evaluated the image generation performance of FewDDM, an extended version of SinDDM trained on a limited number of images instead of a single image. Our results show that both SinDDM and FewDDM are able to generate images superior in quality to those of single-image generative adversarial networks (GANs), and are also highly effective in augmenting medical imaging data with a limited number of samples to improve downstream task performance.}
}
Endnote
%0 Conference Paper
%T Diffusion Model-Based Data Augmentation for Lung Ultrasound Classification with Limited Data
%A Xiaohui Zhang
%A Ahana Gangopadhyay
%A Hsi-Ming Chang
%A Ravi Soni
%B Proceedings of the 3rd Machine Learning for Health Symposium
%C Proceedings of Machine Learning Research
%D 2023
%E Stefan Hegselmann
%E Antonio Parziale
%E Divya Shanmugam
%E Shengpu Tang
%E Mercy Nyamewaa Asiedu
%E Serina Chang
%E Tom Hartvigsen
%E Harvineet Singh
%F pmlr-v225-zhang23a
%I PMLR
%P 664--676
%U https://proceedings.mlr.press/v225/zhang23a.html
%V 225
%X Deep learning models typically require large quantities of data for good generalization. However, acquiring labeled medical imaging data is expensive, particularly for rare pathologies. While standard data augmentation is routinely performed to improve data variety, it may not be sufficient to improve the performance of downstream tasks with a clinical diagnostic purpose. Here we investigate the applicability of SinDDM (Kulikov et al., 2023), a single-image denoising diffusion model, for medical image data augmentation with lung ultrasound (LUS) images. Qualitative and quantitative evaluations of the perceptual quality of the generated images were conducted. A multi-class classification task to detect various pathologies from LUS images was also employed to demonstrate the effectiveness of synthetic data augmentation using SinDDM. We further evaluated the image generation performance of FewDDM, an extended version of SinDDM trained on a limited number of images instead of a single image. Our results show that both SinDDM and FewDDM are able to generate images superior in quality to those of single-image generative adversarial networks (GANs), and are also highly effective in augmenting medical imaging data with a limited number of samples to improve downstream task performance.
APA
Zhang, X., Gangopadhyay, A., Chang, H.-M., & Soni, R. (2023). Diffusion Model-Based Data Augmentation for Lung Ultrasound Classification with Limited Data. Proceedings of the 3rd Machine Learning for Health Symposium, in Proceedings of Machine Learning Research 225:664-676. Available from https://proceedings.mlr.press/v225/zhang23a.html.
