Modeling "Presentness" of Electronic Health Record Data to Improve Patient State Estimation


Jacob Fauber, Christian R. Shelton
Proceedings of the 3rd Machine Learning for Healthcare Conference, PMLR 85:500-513, 2018.


Medical data are not missing at random. The problem is more acute when the observations are over an extended period of time; any particular variable is observed at relatively few time points. We take missing values to be the norm, and treat “presentness” (the times of observations) as additional features to augment the values observed. A joint model over both avoids the “missing at random” assumption. We use piecewise-constant conditional intensity models (PCIMs) to build a generative model of observation times and values. We demonstrate its effectiveness in reconstruction of monitor readings of patient vitals from sparse EHR data.
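
To make the idea of a piecewise-constant intensity concrete, the following is a minimal sketch and not the authors' model. It assumes a single variable whose observation rate takes one of two hand-picked values, depending on whether that variable was observed within the last `window` time units; in a real PCIM the mapping from history to rate is learned, typically as a decision tree. The names `pcim_sufficient_stats`, `pcim_log_likelihood`, `window`, and the two state labels are hypothetical. For a piecewise-constant rate, the log-likelihood of the observation times is the sum over states of (count × log rate) − (rate × time spent in the state).

```python
import math

def pcim_sufficient_stats(event_times, horizon, window):
    """Count events and accumulate time per state on [0, horizon].

    State is "recent" if an observation occurred within the previous
    `window` time units, otherwise "idle" (an illustrative assumption).
    """
    counts = {"recent": 0, "idle": 0}
    durations = {"recent": 0.0, "idle": 0.0}
    prev = None
    for t in sorted(event_times):
        if prev is None:
            # No earlier observation: the whole stretch is idle.
            durations["idle"] += t
            counts["idle"] += 1
        else:
            gap = t - prev
            durations["recent"] += min(gap, window)
            durations["idle"] += max(gap - window, 0.0)
            counts["recent" if gap <= window else "idle"] += 1
        prev = t
    # Time between the last observation (or 0) and the horizon.
    if prev is None:
        durations["idle"] += horizon
    else:
        tail = horizon - prev
        durations["recent"] += min(tail, window)
        durations["idle"] += max(tail - window, 0.0)
    return counts, durations

def pcim_log_likelihood(event_times, horizon, window, rates):
    """Log-likelihood of observation times under the two-state rates."""
    counts, durations = pcim_sufficient_stats(event_times, horizon, window)
    return sum(
        counts[s] * math.log(rates[s]) - rates[s] * durations[s]
        for s in rates
    )

# Toy data: three observation times of one vital sign over 5 time units.
times = [1.0, 1.5, 4.0]
rates = {"recent": 2.0, "idle": 0.5}
print(pcim_log_likelihood(times, horizon=5.0, window=1.0, rates=rates))
```

Because the rate depends on when earlier observations occurred, the observation times themselves carry information, which is the sense in which “presentness” acts as a feature here.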
