Evaluating Domain Generalization for Survival Analysis in Clinical Studies

Florian Pfisterer, Chris Harbron, Gunther Jansen, Tao Xu
Proceedings of the Conference on Health, Inference, and Learning, PMLR 174:32-47, 2022.

Abstract

Machine learning models are often required to generalize to new populations (domains) unseen during training, which may lead to model underperformance. So far, most research has focused on domain generalization methods for image classification tasks, which address the problem by learning domain-invariant predictors. In this study, we assess the efficacy of domain generalization methods in survival analysis. The goal is to predict time-to-event outcomes such as death or disease progression based on baseline demographic and clinical variables of individuals exposed to medical treatment. We benchmark four domain generalization methods and several conventional/established methods on real-world scenarios encountered in clinical practice. This includes tasks such as generalizing from randomized controlled trials to real-world data and identifying prognostic models that hold regardless of treatment or disease subtype. We find that the generalization issue is often not as severe as reported in synthetic scenarios. Furthermore, our results corroborate, also on low-dimensional data, previous findings that domain generalization methods do not consistently outperform classical empirical risk minimization baselines. Finally, to better understand when domain generalization methods can lead to performance gains and thus better outcomes for patients, we quantify the influence of different types of shifts occurring in the data.

Cite this Paper


BibTeX
@InProceedings{pmlr-v174-pfisterer22a,
  title = {Evaluating Domain Generalization for Survival Analysis in Clinical Studies},
  author = {Pfisterer, Florian and Harbron, Chris and Jansen, Gunther and Xu, Tao},
  booktitle = {Proceedings of the Conference on Health, Inference, and Learning},
  pages = {32--47},
  year = {2022},
  editor = {Flores, Gerardo and Chen, George H and Pollard, Tom and Ho, Joyce C and Naumann, Tristan},
  volume = {174},
  series = {Proceedings of Machine Learning Research},
  month = {07--08 Apr},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v174/pfisterer22a/pfisterer22a.pdf},
  url = {https://proceedings.mlr.press/v174/pfisterer22a.html},
  abstract = {Machine learning models are often required to generalize to new populations (domains) unseen during training, which may lead to model underperformance. So far, most research has focused on Domain Generalization methods for image classification tasks, which address the problem by learning domain invariant predictors. In this study, we assess the efficacy of domain generalization methods in survival analysis. The goal is to predict time-to-events such as death or disease progression based on baseline demographic and clinical variables of individuals exposed to medical treatment. We benchmark four domain generalization methods and several conventional/established methods on real world scenarios encountered in clinical practice. This includes tasks such as generalizing between randomized controlled trials to real world data, identification of prognostic models regardless of treatment or disease subtypes. We find that the generalization issue is often not as severe as reported in synthetic scenarios. Furthermore, our results corroborate previous findings that domain generalization often does not consistently outperform classical empirical risk minimization baselines also on low-dimensional data. Finally, to better understand when domain generalization methods can lead to performance gains and thus better outcomes for patients, we quantify the influence of different types of shifts occurring in the data.}
}
Endnote
%0 Conference Paper
%T Evaluating Domain Generalization for Survival Analysis in Clinical Studies
%A Florian Pfisterer
%A Chris Harbron
%A Gunther Jansen
%A Tao Xu
%B Proceedings of the Conference on Health, Inference, and Learning
%C Proceedings of Machine Learning Research
%D 2022
%E Gerardo Flores
%E George H Chen
%E Tom Pollard
%E Joyce C Ho
%E Tristan Naumann
%F pmlr-v174-pfisterer22a
%I PMLR
%P 32--47
%U https://proceedings.mlr.press/v174/pfisterer22a.html
%V 174
%X Machine learning models are often required to generalize to new populations (domains) unseen during training, which may lead to model underperformance. So far, most research has focused on Domain Generalization methods for image classification tasks, which address the problem by learning domain invariant predictors. In this study, we assess the efficacy of domain generalization methods in survival analysis. The goal is to predict time-to-events such as death or disease progression based on baseline demographic and clinical variables of individuals exposed to medical treatment. We benchmark four domain generalization methods and several conventional/established methods on real world scenarios encountered in clinical practice. This includes tasks such as generalizing between randomized controlled trials to real world data, identification of prognostic models regardless of treatment or disease subtypes. We find that the generalization issue is often not as severe as reported in synthetic scenarios. Furthermore, our results corroborate previous findings that domain generalization often does not consistently outperform classical empirical risk minimization baselines also on low-dimensional data. Finally, to better understand when domain generalization methods can lead to performance gains and thus better outcomes for patients, we quantify the influence of different types of shifts occurring in the data.
APA
Pfisterer, F., Harbron, C., Jansen, G., & Xu, T. (2022). Evaluating Domain Generalization for Survival Analysis in Clinical Studies. Proceedings of the Conference on Health, Inference, and Learning, in Proceedings of Machine Learning Research 174:32-47. Available from https://proceedings.mlr.press/v174/pfisterer22a.html.
