Point Processes for Competing Observations with Recurrent Networks (POPCORN): A Generative Model of EHR Data

Shreyas Bhave, Adler Perotte
Proceedings of the 6th Machine Learning for Healthcare Conference, PMLR 149:770-789, 2021.

Abstract

Modeling EHR data is of significant interest in a broad range of applications including prediction of future conditions or building latent representations of patient history. This can be challenging because EHR data is multivariate and irregularly sampled. Traditional treatments of EHR data involve handling irregular sampling by imputation or discretization. In this work, we model the full longitudinal history of a patient using a generative multivariate point process that simultaneously: (1) Models irregularly sampled events probabilistically without discretization or interpolation (2) Has a closed-form likelihood, making training straightforward (3) Encodes dependence between times and events with an approach inspired by competing risk models (4) Allows for direct sampling. We show improved performance on next-event prediction compared to existing approaches. Our proposed framework could potentially be used in many different contexts including prediction, generation of synthetic data and building latent representations of patient history.

Cite this Paper


BibTeX
@InProceedings{pmlr-v149-bhave21a,
  title = {Point Processes for Competing Observations with Recurrent Networks (POPCORN): A Generative Model of EHR Data},
  author = {Bhave, Shreyas and Perotte, Adler},
  booktitle = {Proceedings of the 6th Machine Learning for Healthcare Conference},
  pages = {770--789},
  year = {2021},
  editor = {Jung, Ken and Yeung, Serena and Sendak, Mark and Sjoding, Michael and Ranganath, Rajesh},
  volume = {149},
  series = {Proceedings of Machine Learning Research},
  month = {06--07 Aug},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v149/bhave21a/bhave21a.pdf},
  url = {https://proceedings.mlr.press/v149/bhave21a.html},
  abstract = {Modeling EHR data is of significant interest in a broad range of applications including prediction of future conditions or building latent representations of patient history. This can be challenging because EHR data is multivariate and irregularly sampled. Traditional treatments of EHR data involve handling irregular sampling by imputation or discretization. In this work, we model the full longitudinal history of a patient using a generative multivariate point process that simultaneously: (1) Models irregularly sampled events probabilistically without discretization or interpolation (2) Has a closed-form likelihood, making training straightforward (3) Encodes dependence between times and events with an approach inspired by competing risk models (4) Allows for direct sampling. We show improved performance on next-event prediction compared to existing approaches. Our proposed framework could potentially be used in many different contexts including prediction, generation of synthetic data and building latent representations of patient history.}
}
Endnote
%0 Conference Paper
%T Point Processes for Competing Observations with Recurrent Networks (POPCORN): A Generative Model of EHR Data
%A Shreyas Bhave
%A Adler Perotte
%B Proceedings of the 6th Machine Learning for Healthcare Conference
%C Proceedings of Machine Learning Research
%D 2021
%E Ken Jung
%E Serena Yeung
%E Mark Sendak
%E Michael Sjoding
%E Rajesh Ranganath
%F pmlr-v149-bhave21a
%I PMLR
%P 770--789
%U https://proceedings.mlr.press/v149/bhave21a.html
%V 149
%X Modeling EHR data is of significant interest in a broad range of applications including prediction of future conditions or building latent representations of patient history. This can be challenging because EHR data is multivariate and irregularly sampled. Traditional treatments of EHR data involve handling irregular sampling by imputation or discretization. In this work, we model the full longitudinal history of a patient using a generative multivariate point process that simultaneously: (1) Models irregularly sampled events probabilistically without discretization or interpolation (2) Has a closed-form likelihood, making training straightforward (3) Encodes dependence between times and events with an approach inspired by competing risk models (4) Allows for direct sampling. We show improved performance on next-event prediction compared to existing approaches. Our proposed framework could potentially be used in many different contexts including prediction, generation of synthetic data and building latent representations of patient history.
APA
Bhave, S. & Perotte, A. (2021). Point Processes for Competing Observations with Recurrent Networks (POPCORN): A Generative Model of EHR Data. Proceedings of the 6th Machine Learning for Healthcare Conference, in Proceedings of Machine Learning Research 149:770-789. Available from https://proceedings.mlr.press/v149/bhave21a.html.

Related Material