Practical Challenges in Differentially-Private Federated Survival Analysis of Medical Data

Shadi Rahimian, Raouf Kerkouche, Ina Kurth, Mario Fritz
Proceedings of the Conference on Health, Inference, and Learning, PMLR 174:411-425, 2022.

Abstract

Survival analysis or time-to-event analysis aims to model and predict the time it takes for an event of interest to happen in a population or an individual. In the medical context this event might be the time of dying, metastasis, recurrence of cancer, etc. Recently, the use of neural networks that are specifically designed for survival analysis has become more popular and an attractive alternative to more traditional methods. In this paper, we take advantage of the inherent properties of neural networks to federate the process of training of these models. This is crucial in the medical domain since data is scarce and collaboration of multiple health centers is essential to make a conclusive decision about the properties of a treatment or a disease. To ensure the privacy of the datasets, it is common to utilize differential privacy on top of federated learning. Differential privacy acts by introducing random noise to different stages of training, thus making it harder for an adversary to extract details about the data. However, in the realistic setting of small medical datasets and only a few data centers, this noise makes it harder for the models to converge. To address this problem, we propose DPFed-post which adds a post-processing stage to the private federated learning scheme. This extra step helps to regulate the magnitude of the noisy average parameter update and easier convergence of the model. For our experiments, we choose 3 real-world datasets in the realistic setting when each health center has only a few hundred records, and we show that DPFed-post successfully increases the performance of the models by an average of up to $17%$ compared to the standard differentially private federated learning scheme.

Cite this Paper


BibTeX
@InProceedings{pmlr-v174-rahimian22a, title = {Practical Challenges in Differentially-Private Federated Survival Analysis of Medical Data}, author = {Rahimian, Shadi and Kerkouche, Raouf and Kurth, Ina and Fritz, Mario}, booktitle = {Proceedings of the Conference on Health, Inference, and Learning}, pages = {411--425}, year = {2022}, editor = {Flores, Gerardo and Chen, George H and Pollard, Tom and Ho, Joyce C and Naumann, Tristan}, volume = {174}, series = {Proceedings of Machine Learning Research}, month = {07--08 Apr}, publisher = {PMLR}, pdf = {https://proceedings.mlr.press/v174/rahimian22a/rahimian22a.pdf}, url = {https://proceedings.mlr.press/v174/rahimian22a.html}, abstract = {Survival analysis or time-to-event analysis aims to model and predict the time it takes for an event of interest to happen in a population or an individual. In the medical context this event might be the time of dying, metastasis, recurrence of cancer, etc. Recently, the use of neural networks that are specifically designed for survival analysis has become more popular and an attractive alternative to more traditional methods. In this paper, we take advantage of the inherent properties of neural networks to federate the process of training of these models. This is crucial in the medical domain since data is scarce and collaboration of multiple health centers is essential to make a conclusive decision about the properties of a treatment or a disease. To ensure the privacy of the datasets, it is common to utilize differential privacy on top of federated learning. Differential privacy acts by introducing random noise to different stages of training, thus making it harder for an adversary to extract details about the data. However, in the realistic setting of small medical datasets and only a few data centers, this noise makes it harder for the models to converge. To address this problem, we propose DPFed-post which adds a post-processing stage to the private federated learning scheme. This extra step helps to regulate the magnitude of the noisy average parameter update and easier convergence of the model. For our experiments, we choose 3 real-world datasets in the realistic setting when each health center has only a few hundred records, and we show that DPFed-post successfully increases the performance of the models by an average of up to $17%$ compared to the standard differentially private federated learning scheme.} }
Endnote
%0 Conference Paper %T Practical Challenges in Differentially-Private Federated Survival Analysis of Medical Data %A Shadi Rahimian %A Raouf Kerkouche %A Ina Kurth %A Mario Fritz %B Proceedings of the Conference on Health, Inference, and Learning %C Proceedings of Machine Learning Research %D 2022 %E Gerardo Flores %E George H Chen %E Tom Pollard %E Joyce C Ho %E Tristan Naumann %F pmlr-v174-rahimian22a %I PMLR %P 411--425 %U https://proceedings.mlr.press/v174/rahimian22a.html %V 174 %X Survival analysis or time-to-event analysis aims to model and predict the time it takes for an event of interest to happen in a population or an individual. In the medical context this event might be the time of dying, metastasis, recurrence of cancer, etc. Recently, the use of neural networks that are specifically designed for survival analysis has become more popular and an attractive alternative to more traditional methods. In this paper, we take advantage of the inherent properties of neural networks to federate the process of training of these models. This is crucial in the medical domain since data is scarce and collaboration of multiple health centers is essential to make a conclusive decision about the properties of a treatment or a disease. To ensure the privacy of the datasets, it is common to utilize differential privacy on top of federated learning. Differential privacy acts by introducing random noise to different stages of training, thus making it harder for an adversary to extract details about the data. However, in the realistic setting of small medical datasets and only a few data centers, this noise makes it harder for the models to converge. To address this problem, we propose DPFed-post which adds a post-processing stage to the private federated learning scheme. This extra step helps to regulate the magnitude of the noisy average parameter update and easier convergence of the model. For our experiments, we choose 3 real-world datasets in the realistic setting when each health center has only a few hundred records, and we show that DPFed-post successfully increases the performance of the models by an average of up to $17%$ compared to the standard differentially private federated learning scheme.
APA
Rahimian, S., Kerkouche, R., Kurth, I. & Fritz, M.. (2022). Practical Challenges in Differentially-Private Federated Survival Analysis of Medical Data. Proceedings of the Conference on Health, Inference, and Learning, in Proceedings of Machine Learning Research 174:411-425 Available from https://proceedings.mlr.press/v174/rahimian22a.html.

Related Material