Deep EHR: Chronic Disease Prediction Using Medical Notes

Jingshu Liu, Zachariah Zhang, Narges Razavian
Proceedings of the 3rd Machine Learning for Healthcare Conference, PMLR 85:440-464, 2018.

Abstract

Early detection of preventable diseases is important for better disease management, improved interventions, and more efficient health-care resource allocation. Various machine learning approaches have been developed to exploit information in Electronic Health Record (EHR) for this task. Majority of previous attempts, however, focus on structured fields and lose the vast amount of information in the unstructured notes. In this work we propose a general multi-task framework for disease onset prediction that combines both free-text medical notes and structured information. We compare performance of different deep learning architectures including CNN, LSTM and hierarchical models. In contrast to traditional text-based prediction models, our approach does not require disease specific feature engineering, and can handle negations and numerical values that exist in the text. Our results on a cohort of about 1 million patients show that models using text outperform models using just structured data, and that models capable of using numerical values and negations in the text, in addition to the raw text, further improve performance. Additionally, we compare different visualization methods for medical professionals to interpret model predictions.

Cite this Paper


BibTeX
@InProceedings{pmlr-v85-liu18b, title = {Deep EHR: Chronic Disease Prediction Using Medical Notes}, author = {Liu, Jingshu and Zhang, Zachariah and Razavian, Narges}, booktitle = {Proceedings of the 3rd Machine Learning for Healthcare Conference}, pages = {440--464}, year = {2018}, editor = {Doshi-Velez, Finale and Fackler, Jim and Jung, Ken and Kale, David and Ranganath, Rajesh and Wallace, Byron and Wiens, Jenna}, volume = {85}, series = {Proceedings of Machine Learning Research}, month = {17--18 Aug}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v85/liu18b/liu18b.pdf}, url = {https://proceedings.mlr.press/v85/liu18b.html}, abstract = {Early detection of preventable diseases is important for better disease management, improved interventions, and more efficient health-care resource allocation. Various machine learning approaches have been developed to exploit information in Electronic Health Record (EHR) for this task. Majority of previous attempts, however, focus on structured fields and lose the vast amount of information in the unstructured notes. In this work we propose a general multi-task framework for disease onset prediction that combines both free-text medical notes and structured information. We compare performance of different deep learning architectures including CNN, LSTM and hierarchical models. In contrast to traditional text-based prediction models, our approach does not require disease specific feature engineering, and can handle negations and numerical values that exist in the text. Our results on a cohort of about 1 million patients show that models using text outperform models using just structured data, and that models capable of using numerical values and negations in the text, in addition to the raw text, further improve performance. Additionally, we compare different visualization methods for medical professionals to interpret model predictions. } }
Endnote
%0 Conference Paper %T Deep EHR: Chronic Disease Prediction Using Medical Notes %A Jingshu Liu %A Zachariah Zhang %A Narges Razavian %B Proceedings of the 3rd Machine Learning for Healthcare Conference %C Proceedings of Machine Learning Research %D 2018 %E Finale Doshi-Velez %E Jim Fackler %E Ken Jung %E David Kale %E Rajesh Ranganath %E Byron Wallace %E Jenna Wiens %F pmlr-v85-liu18b %I PMLR %P 440--464 %U https://proceedings.mlr.press/v85/liu18b.html %V 85 %X Early detection of preventable diseases is important for better disease management, improved interventions, and more efficient health-care resource allocation. Various machine learning approaches have been developed to exploit information in Electronic Health Record (EHR) for this task. Majority of previous attempts, however, focus on structured fields and lose the vast amount of information in the unstructured notes. In this work we propose a general multi-task framework for disease onset prediction that combines both free-text medical notes and structured information. We compare performance of different deep learning architectures including CNN, LSTM and hierarchical models. In contrast to traditional text-based prediction models, our approach does not require disease specific feature engineering, and can handle negations and numerical values that exist in the text. Our results on a cohort of about 1 million patients show that models using text outperform models using just structured data, and that models capable of using numerical values and negations in the text, in addition to the raw text, further improve performance. Additionally, we compare different visualization methods for medical professionals to interpret model predictions.
APA
Liu, J., Zhang, Z. & Razavian, N.. (2018). Deep EHR: Chronic Disease Prediction Using Medical Notes. Proceedings of the 3rd Machine Learning for Healthcare Conference, in Proceedings of Machine Learning Research 85:440-464 Available from https://proceedings.mlr.press/v85/liu18b.html.

Related Material