Large-Scale Study of Temporal Shift in Health Insurance Claims

Christina X Ji, Ahmed M Alaa, David Sontag
Proceedings of the Conference on Health, Inference, and Learning, PMLR 209:243-278, 2023.

Abstract

Most machine learning models for predicting clinical outcomes are developed using historical data. Yet, even if these models are deployed in the near future, dataset shift over time may result in less than ideal performance. To capture this phenomenon, we consider a task—that is, an outcome to be predicted at a particular time point—to be non-stationary if a historical model is no longer optimal for predicting that outcome. We build an algorithm to test for temporal shift either at the population level or within a discovered sub-population. Then, we construct a meta-algorithm to perform a retrospective scan for temporal shift on a large collection of tasks. Our algorithms enable us to perform the first comprehensive evaluation of temporal shift in healthcare to our knowledge. We create 1,010 tasks by evaluating 242 healthcare outcomes for temporal shift from 2015 to 2020 on a health insurance claims dataset. 9.7% of the tasks show temporal shifts at the population level, and 93.0% have some sub-population affected by shifts. We dive into case studies to understand the clinical implications. Our analysis highlights the widespread prevalence of temporal shifts in healthcare.

Cite this Paper


BibTeX
@InProceedings{pmlr-v209-ji23a,
  title     = {Large-Scale Study of Temporal Shift in Health Insurance Claims},
  author    = {Ji, Christina X and Alaa, Ahmed M and Sontag, David},
  booktitle = {Proceedings of the Conference on Health, Inference, and Learning},
  pages     = {243--278},
  year      = {2023},
  editor    = {Mortazavi, Bobak J. and Sarker, Tasmie and Beam, Andrew and Ho, Joyce C.},
  volume    = {209},
  series    = {Proceedings of Machine Learning Research},
  month     = {22 Jun--24 Jun},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v209/ji23a/ji23a.pdf},
  url       = {https://proceedings.mlr.press/v209/ji23a.html},
  abstract  = {Most machine learning models for predicting clinical outcomes are developed using historical data. Yet, even if these models are deployed in the near future, dataset shift over time may result in less than ideal performance. To capture this phenomenon, we consider a task---that is, an outcome to be predicted at a particular time point---to be non-stationary if a historical model is no longer optimal for predicting that outcome. We build an algorithm to test for temporal shift either at the population level or within a discovered sub-population. Then, we construct a meta-algorithm to perform a retrospective scan for temporal shift on a large collection of tasks. Our algorithms enable us to perform the first comprehensive evaluation of temporal shift in healthcare to our knowledge. We create 1,010 tasks by evaluating 242 healthcare outcomes for temporal shift from 2015 to 2020 on a health insurance claims dataset. 9.7\% of the tasks show temporal shifts at the population level, and 93.0\% have some sub-population affected by shifts. We dive into case studies to understand the clinical implications. Our analysis highlights the widespread prevalence of temporal shifts in healthcare.}
}
Endnote
%0 Conference Paper
%T Large-Scale Study of Temporal Shift in Health Insurance Claims
%A Christina X Ji
%A Ahmed M Alaa
%A David Sontag
%B Proceedings of the Conference on Health, Inference, and Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Bobak J. Mortazavi
%E Tasmie Sarker
%E Andrew Beam
%E Joyce C. Ho
%F pmlr-v209-ji23a
%I PMLR
%P 243--278
%U https://proceedings.mlr.press/v209/ji23a.html
%V 209
%X Most machine learning models for predicting clinical outcomes are developed using historical data. Yet, even if these models are deployed in the near future, dataset shift over time may result in less than ideal performance. To capture this phenomenon, we consider a task—that is, an outcome to be predicted at a particular time point—to be non-stationary if a historical model is no longer optimal for predicting that outcome. We build an algorithm to test for temporal shift either at the population level or within a discovered sub-population. Then, we construct a meta-algorithm to perform a retrospective scan for temporal shift on a large collection of tasks. Our algorithms enable us to perform the first comprehensive evaluation of temporal shift in healthcare to our knowledge. We create 1,010 tasks by evaluating 242 healthcare outcomes for temporal shift from 2015 to 2020 on a health insurance claims dataset. 9.7% of the tasks show temporal shifts at the population level, and 93.0% have some sub-population affected by shifts. We dive into case studies to understand the clinical implications. Our analysis highlights the widespread prevalence of temporal shifts in healthcare.
APA
Ji, C. X., Alaa, A. M., & Sontag, D. (2023). Large-Scale Study of Temporal Shift in Health Insurance Claims. Proceedings of the Conference on Health, Inference, and Learning, in Proceedings of Machine Learning Research 209:243-278. Available from https://proceedings.mlr.press/v209/ji23a.html.
