Learning to Summarize Electronic Health Records Using Cross-Modality Correspondences

Jen J. Gong, John V. Guttag
Proceedings of the 3rd Machine Learning for Healthcare Conference, PMLR 85:551-570, 2018.

Abstract

Electronic Health Records (EHRs) contain an overwhelming amount of information about each patient, making it difficult for clinicians to quickly find the most salient information. Accurate, concise summarization of relevant data can help alleviate this cognitive burden. In practice, clinical narrative notes serve this purpose during the course of care, but they are only intermittently updated and are sometimes missing information. We address this problem by learning to generate topics that should be in summaries of structured health record data at any point during a stay. We use the detailed, high-dimensional structured data to predict existing clinical note topics. Our model can generate topics based on structured health record data, even when a real note does not exist. We demonstrate that using structured data alone, we are able to generate note topics comparable to the performance of using prior notes alone. Our method is also capable of generating the first note in the stay. We demonstrate that our predicted topic distributions are meaningful using the downstream task of predicting in-hospital mortality. We show that our generated note topic vectors perform comparably to, or even outperform, topics from the actual notes on predicting in-hospital mortality.

Cite this Paper


BibTeX
@InProceedings{pmlr-v85-gong18a,
  title = {Learning to Summarize Electronic Health Records Using Cross-Modality Correspondences},
  author = {Gong, Jen J. and Guttag, John V.},
  booktitle = {Proceedings of the 3rd Machine Learning for Healthcare Conference},
  pages = {551--570},
  year = {2018},
  editor = {Doshi-Velez, Finale and Fackler, Jim and Jung, Ken and Kale, David and Ranganath, Rajesh and Wallace, Byron and Wiens, Jenna},
  volume = {85},
  series = {Proceedings of Machine Learning Research},
  month = {17--18 Aug},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v85/gong18a/gong18a.pdf},
  url = {https://proceedings.mlr.press/v85/gong18a.html},
  abstract = {Electronic Health Records (EHRs) contain an overwhelming amount of information about each patient, making it difficult for clinicians to quickly find the most salient information. Accurate, concise summarization of relevant data can help alleviate this cognitive burden. In practice, clinical narrative notes serve this purpose during the course of care, but they are only intermittently updated and are sometimes missing information. We address this problem by learning to generate topics that should be in summaries of structured health record data at any point during a stay. We use the detailed, high-dimensional structured data to predict existing clinical note topics. Our model can generate topics based on structured health record data, even when a real note does not exist. We demonstrate that using structured data alone, we are able to generate note topics comparable to the performance of using prior notes alone. Our method is also capable of generating the first note in the stay. We demonstrate that our predicted topic distributions are meaningful using the downstream task of predicting in-hospital mortality. We show that our generated note topic vectors perform comparably or even outperform topics from the actual notes on predicting in-hospital mortality.}
}
Endnote
%0 Conference Paper
%T Learning to Summarize Electronic Health Records Using Cross-Modality Correspondences
%A Jen J. Gong
%A John V. Guttag
%B Proceedings of the 3rd Machine Learning for Healthcare Conference
%C Proceedings of Machine Learning Research
%D 2018
%E Finale Doshi-Velez
%E Jim Fackler
%E Ken Jung
%E David Kale
%E Rajesh Ranganath
%E Byron Wallace
%E Jenna Wiens
%F pmlr-v85-gong18a
%I PMLR
%P 551--570
%U https://proceedings.mlr.press/v85/gong18a.html
%V 85
%X Electronic Health Records (EHRs) contain an overwhelming amount of information about each patient, making it difficult for clinicians to quickly find the most salient information. Accurate, concise summarization of relevant data can help alleviate this cognitive burden. In practice, clinical narrative notes serve this purpose during the course of care, but they are only intermittently updated and are sometimes missing information. We address this problem by learning to generate topics that should be in summaries of structured health record data at any point during a stay. We use the detailed, high-dimensional structured data to predict existing clinical note topics. Our model can generate topics based on structured health record data, even when a real note does not exist. We demonstrate that using structured data alone, we are able to generate note topics comparable to the performance of using prior notes alone. Our method is also capable of generating the first note in the stay. We demonstrate that our predicted topic distributions are meaningful using the downstream task of predicting in-hospital mortality. We show that our generated note topic vectors perform comparably or even outperform topics from the actual notes on predicting in-hospital mortality.
APA
Gong, J. J., & Guttag, J. V. (2018). Learning to Summarize Electronic Health Records Using Cross-Modality Correspondences. Proceedings of the 3rd Machine Learning for Healthcare Conference, in Proceedings of Machine Learning Research 85:551-570. Available from https://proceedings.mlr.press/v85/gong18a.html.
