Differentially Private Survival Function Estimation

Lovedeep Gondara, Ke Wang
Proceedings of the 5th Machine Learning for Healthcare Conference, PMLR 126:271-291, 2020.

Abstract

Survival function estimation is used in many disciplines, but it is most common in medical analytics in the form of the Kaplan-Meier estimator. Sensitive data (patient records) is used in the estimation without any explicit control on the information leakage, which is a significant privacy concern. We propose a first differentially private estimator of the survival function and show that it can be easily extended to provide differentially private confidence intervals and test statistics without spending any extra privacy budget. We further provide extensions for differentially private estimation of the competing risk cumulative incidence function, Nelson-Aalen’s estimator for the hazard function, etc. Using eleven real-life clinical datasets, we provide empirical evidence that our proposed method provides good utility while simultaneously providing strong privacy guarantees.

Cite this Paper


BibTeX
@InProceedings{pmlr-v126-gondara20a,
  title     = {Differentially Private Survival Function Estimation},
  author    = {Gondara, Lovedeep and Wang, Ke},
  booktitle = {Proceedings of the 5th Machine Learning for Healthcare Conference},
  pages     = {271--291},
  year      = {2020},
  editor    = {Doshi-Velez, Finale and Fackler, Jim and Jung, Ken and Kale, David and Ranganath, Rajesh and Wallace, Byron and Wiens, Jenna},
  volume    = {126},
  series    = {Proceedings of Machine Learning Research},
  month     = {07--08 Aug},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v126/gondara20a/gondara20a.pdf},
  url       = {https://proceedings.mlr.press/v126/gondara20a.html},
  abstract  = {Survival function estimation is used in many disciplines, but it is most common in medical analytics in the form of the Kaplan-Meier estimator. Sensitive data (patient records) is used in the estimation without any explicit control on the information leakage, which is a significant privacy concern. We propose a first differentially private estimator of the survival function and show that it can be easily extended to provide differentially private confidence intervals and test statistics without spending any extra privacy budget. We further provide extensions for differentially private estimation of the competing risk cumulative incidence function, Nelson-Aalen’s estimator for the hazard function, etc. Using eleven real-life clinical datasets, we provide empirical evidence that our proposed method provides good utility while simultaneously providing strong privacy guarantees.}
}
Endnote
%0 Conference Paper
%T Differentially Private Survival Function Estimation
%A Lovedeep Gondara
%A Ke Wang
%B Proceedings of the 5th Machine Learning for Healthcare Conference
%C Proceedings of Machine Learning Research
%D 2020
%E Finale Doshi-Velez
%E Jim Fackler
%E Ken Jung
%E David Kale
%E Rajesh Ranganath
%E Byron Wallace
%E Jenna Wiens
%F pmlr-v126-gondara20a
%I PMLR
%P 271--291
%U https://proceedings.mlr.press/v126/gondara20a.html
%V 126
%X Survival function estimation is used in many disciplines, but it is most common in medical analytics in the form of the Kaplan-Meier estimator. Sensitive data (patient records) is used in the estimation without any explicit control on the information leakage, which is a significant privacy concern. We propose a first differentially private estimator of the survival function and show that it can be easily extended to provide differentially private confidence intervals and test statistics without spending any extra privacy budget. We further provide extensions for differentially private estimation of the competing risk cumulative incidence function, Nelson-Aalen’s estimator for the hazard function, etc. Using eleven real-life clinical datasets, we provide empirical evidence that our proposed method provides good utility while simultaneously providing strong privacy guarantees.
APA
Gondara, L. & Wang, K. (2020). Differentially Private Survival Function Estimation. Proceedings of the 5th Machine Learning for Healthcare Conference, in Proceedings of Machine Learning Research 126:271-291. Available from https://proceedings.mlr.press/v126/gondara20a.html.
