Deep EHR: Chronic Disease Prediction Using Medical Notes
Proceedings of the 3rd Machine Learning for Healthcare Conference, PMLR 85:440-464, 2018.
Early detection of preventable diseases is important for better disease management, improved interventions, and more efficient health-care resource allocation. Various machine learning approaches have been developed to exploit information in Electronic Health Record (EHR) for this task. Majority of previous attempts, however, focus on structured fields and lose the vast amount of information in the unstructured notes. In this work we propose a general multi-task framework for disease onset prediction that combines both free-text medical notes and structured information. We compare performance of different deep learning architectures including CNN, LSTM and hierarchical models. In contrast to traditional text-based prediction models, our approach does not require disease specific feature engineering, and can handle negations and numerical values that exist in the text. Our results on a cohort of about 1 million patients show that models using text outperform models using just structured data, and that models capable of using numerical values and negations in the text, in addition to the raw text, further improve performance. Additionally, we compare different visualization methods for medical professionals to interpret model predictions.