Towards the Practical Utility of Federated Learning in the Medical Domain

Hyeonji Hwang, Seongjun Yang, Daeyoung Kim, Radhika Dua, Jong-Yeup Kim, Eunho Yang, Edward Choi
Proceedings of the Conference on Health, Inference, and Learning, PMLR 209:163-181, 2023.

Abstract

Federated learning (FL) is an active area of research. One of the most suitable areas for adopting FL is the medical domain, where patient privacy must be respected. Previous research, however, does not provide a practical guide to applying FL in the medical domain. We propose empirical benchmarks and experimental settings for three representative medical datasets with different modalities: longitudinal electronic health records, skin cancer images, and electrocardiogram signals. The likely users of FL such as medical institutions and IT companies can take these benchmarks as guides for adopting FL and minimize their trial and error. For each dataset, each client data is from a different source to preserve real-world heterogeneity. We evaluate six FL algorithms designed for addressing data heterogeneity among clients, and a hybrid algorithm combining the strengths of two representative FL algorithms. Based on experiment results from three modalities, we discover that simple FL algorithms tend to outperform more sophisticated ones, while the hybrid algorithm consistently shows good, if not the best performance. We also find that a frequent global model update leads to better performance under a fixed training iteration budget. As the number of participating clients increases, higher cost is incurred due to increased IT administrators and GPUs, but the performance consistently increases. We expect future users will refer to these empirical benchmarks to design the FL experiments in the medical domain considering their clinical tasks and obtain stronger performance with lower costs.

Cite this Paper


BibTeX
@InProceedings{pmlr-v209-hwang23a, title = {Towards the Practical Utility of Federated Learning in the Medical Domain}, author = {Hwang, Hyeonji and Yang, Seongjun and Kim, Daeyoung and Dua, Radhika and Kim, Jong-Yeup and Yang, Eunho and Choi, Edward}, booktitle = {Proceedings of the Conference on Health, Inference, and Learning}, pages = {163--181}, year = {2023}, editor = {Mortazavi, Bobak J. and Sarker, Tasmie and Beam, Andrew and Ho, Joyce C.}, volume = {209}, series = {Proceedings of Machine Learning Research}, month = {22 Jun--24 Jun}, publisher = {PMLR}, pdf = {https://proceedings.mlr.press/v209/hwang23a/hwang23a.pdf}, url = {https://proceedings.mlr.press/v209/hwang23a.html}, abstract = {Federated learning (FL) is an active area of research. One of the most suitable areas for adopting FL is the medical domain, where patient privacy must be respected. Previous research, however, does not provide a practical guide to applying FL in the medical domain. We propose empirical benchmarks and experimental settings for three representative medical datasets with different modalities: longitudinal electronic health records, skin cancer images, and electrocardiogram signals. The likely users of FL such as medical institutions and IT companies can take these benchmarks as guides for adopting FL and minimize their trial and error. For each dataset, each client data is from a different source to preserve real-world heterogeneity. We evaluate six FL algorithms designed for addressing data heterogeneity among clients, and a hybrid algorithm combining the strengths of two representative FL algorithms. Based on experiment results from three modalities, we discover that simple FL algorithms tend to outperform more sophisticated ones, while the hybrid algorithm consistently shows good, if not the best performance. We also find that a frequent global model update leads to better performance under a fixed training iteration budget. As the number of participating clients increases, higher cost is incurred due to increased IT administrators and GPUs, but the performance consistently increases. We expect future users will refer to these empirical benchmarks to design the FL experiments in the medical domain considering their clinical tasks and obtain stronger performance with lower costs.} }
Endnote
%0 Conference Paper %T Towards the Practical Utility of Federated Learning in the Medical Domain %A Hyeonji Hwang %A Seongjun Yang %A Daeyoung Kim %A Radhika Dua %A Jong-Yeup Kim %A Eunho Yang %A Edward Choi %B Proceedings of the Conference on Health, Inference, and Learning %C Proceedings of Machine Learning Research %D 2023 %E Bobak J. Mortazavi %E Tasmie Sarker %E Andrew Beam %E Joyce C. Ho %F pmlr-v209-hwang23a %I PMLR %P 163--181 %U https://proceedings.mlr.press/v209/hwang23a.html %V 209 %X Federated learning (FL) is an active area of research. One of the most suitable areas for adopting FL is the medical domain, where patient privacy must be respected. Previous research, however, does not provide a practical guide to applying FL in the medical domain. We propose empirical benchmarks and experimental settings for three representative medical datasets with different modalities: longitudinal electronic health records, skin cancer images, and electrocardiogram signals. The likely users of FL such as medical institutions and IT companies can take these benchmarks as guides for adopting FL and minimize their trial and error. For each dataset, each client data is from a different source to preserve real-world heterogeneity. We evaluate six FL algorithms designed for addressing data heterogeneity among clients, and a hybrid algorithm combining the strengths of two representative FL algorithms. Based on experiment results from three modalities, we discover that simple FL algorithms tend to outperform more sophisticated ones, while the hybrid algorithm consistently shows good, if not the best performance. We also find that a frequent global model update leads to better performance under a fixed training iteration budget. As the number of participating clients increases, higher cost is incurred due to increased IT administrators and GPUs, but the performance consistently increases. We expect future users will refer to these empirical benchmarks to design the FL experiments in the medical domain considering their clinical tasks and obtain stronger performance with lower costs.
APA
Hwang, H., Yang, S., Kim, D., Dua, R., Kim, J., Yang, E. & Choi, E.. (2023). Towards the Practical Utility of Federated Learning in the Medical Domain. Proceedings of the Conference on Health, Inference, and Learning, in Proceedings of Machine Learning Research 209:163-181 Available from https://proceedings.mlr.press/v209/hwang23a.html.

Related Material