Instability in clinical risk stratification models using deep learning

Daniel Lopez-Martinez; Alex Yakubovich; Martin Seneviratne; Adam D. Lelkes; Akshit Tyagi; Jonas Kemp; Ethan Steinberg; N. Lance Downing; Ron C. Li; Keith E. Morse; Nigam H. Shah; Ming-Jun Chen

Proceedings of the 2nd Machine Learning for Health symposium, PMLR 193:552-565, 2022.

Abstract

While it has been well known in the ML community that deep learning models suffer from instability, the consequences for healthcare deployments are under-characterised. We study the stability of different model architectures trained on electronic health records, using a set of outpatient prediction tasks as a case study. We show that repeated training runs of the same deep learning model on the same training data can result in significantly different outcomes at a patient level even though global performance metrics remain stable. We propose two stability metrics for measuring the effect of randomness of model training, as well as mitigation strategies for improving model stability.

Cite this Paper

BibTeX


@InProceedings{pmlr-v193-lopez-martinez22a,
  title = 	 {Instability in clinical risk stratification models using deep learning},
  author =       {Lopez-Martinez, Daniel and Yakubovich, Alex and Seneviratne, Martin and Lelkes, Adam D. and Tyagi, Akshit and Kemp, Jonas and Steinberg, Ethan and Downing, N. Lance and Li, Ron C. and Morse, Keith E. and Shah, Nigam H. and Chen, Ming-Jun},
  booktitle = 	 {Proceedings of the 2nd Machine Learning for Health symposium},
  pages = 	 {552--565},
  year = 	 {2022},
  editor = 	 {Parziale, Antonio and Agrawal, Monica and Joshi, Shalmali and Chen, Irene Y. and Tang, Shengpu and Oala, Luis and Subbaswamy, Adarsh},
  volume = 	 {193},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {28 Nov},
  publisher =    {PMLR},
  pdf = 	 {https://proceedings.mlr.press/v193/lopez-martinez22a/lopez-martinez22a.pdf},
  url = 	 {https://proceedings.mlr.press/v193/lopez-martinez22a.html},
  abstract = 	 {While it has been well known in the ML community that deep learning models suffer from instability, the consequences for healthcare deployments are under-characterised. We study the stability of different model architectures trained on electronic health records, using a set of outpatient prediction tasks as a case study. We show that repeated training runs of the same deep learning model on the same training data can result in significantly different outcomes at a patient level even though global performance metrics remain stable. We propose two stability metrics for measuring the effect of randomness of model training, as well as mitigation strategies for improving model stability.}
}

Endnote

%0 Conference Paper
%T Instability in clinical risk stratification models using deep learning
%A Daniel Lopez-Martinez
%A Alex Yakubovich
%A Martin Seneviratne
%A Adam D. Lelkes
%A Akshit Tyagi
%A Jonas Kemp
%A Ethan Steinberg
%A N. Lance Downing
%A Ron C. Li
%A Keith E. Morse
%A Nigam H. Shah
%A Ming-Jun Chen
%B Proceedings of the 2nd Machine Learning for Health symposium
%C Proceedings of Machine Learning Research
%D 2022
%E Antonio Parziale
%E Monica Agrawal
%E Shalmali Joshi
%E Irene Y. Chen
%E Shengpu Tang
%E Luis Oala
%E Adarsh Subbaswamy
%F pmlr-v193-lopez-martinez22a
%I PMLR
%P 552--565
%U https://proceedings.mlr.press/v193/lopez-martinez22a.html
%V 193
%X While it has been well known in the ML community that deep learning models suffer from instability, the consequences for healthcare deployments are under-characterised. We study the stability of different model architectures trained on electronic health records, using a set of outpatient prediction tasks as a case study. We show that repeated training runs of the same deep learning model on the same training data can result in significantly different outcomes at a patient level even though global performance metrics remain stable. We propose two stability metrics for measuring the effect of randomness of model training, as well as mitigation strategies for improving model stability.

APA


Lopez-Martinez, D., Yakubovich, A., Seneviratne, M., Lelkes, A.D., Tyagi, A., Kemp, J., Steinberg, E., Downing, N.L., Li, R.C., Morse, K.E., Shah, N.H. & Chen, M. (2022). Instability in clinical risk stratification models using deep learning. Proceedings of the 2nd Machine Learning for Health symposium, in Proceedings of Machine Learning Research 193:552-565. Available from https://proceedings.mlr.press/v193/lopez-martinez22a.html.
