Multi-Label Learning from Medical Plain Text with Convolutional Residual Models


Yinyuan Zhang, Ricardo Henao, Zhe Gan, Yitong Li, Lawrence Carin
Proceedings of the 3rd Machine Learning for Healthcare Conference, PMLR 85:280-294, 2018.


Predicting diagnoses from Electronic Health Records (EHRs) is an important medical application of multi-label learning. We propose a convolutional residual model for multi-label classification from doctors' notes in EHR data. Because a given patient may have multiple diagnoses, the task is inherently multi-label. We employ a Convolutional Neural Network (CNN) to encode plain text into a fixed-length sentence embedding vector. Since diagnoses are typically correlated, a deep residual network is placed on top of the CNN encoder to capture label (diagnosis) dependencies and to incorporate information directly from the encoded sentence vector. We evaluate the proposed model on a real EHR dataset, comparing it with several well-known baselines on the task of predicting diagnoses from doctors' notes. Experimental results show that the proposed convolutional residual model outperforms these baselines.
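
The architecture described in the abstract can be illustrated with a minimal forward-pass sketch in NumPy. This is not the authors' implementation: the layer sizes, the single filter width, the max-over-time pooling, the exact form of the residual block (label scores refined by a correction computed from the current scores concatenated with the sentence vector), and the sharing of residual weights across blocks are all assumptions made for illustration. All names (`encode`, `predict`, `init_params`) are hypothetical, and the weights are random rather than trained.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration (not from the paper).
VOCAB, EMB, FILTERS, WIDTH, LABELS, HIDDEN = 50, 8, 16, 3, 5, 12


def relu(x):
    return np.maximum(x, 0.0)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def init_params():
    """Random, untrained weights for the sketch."""
    s = 0.1
    return {
        "emb": rng.normal(0, s, (VOCAB, EMB)),
        "conv_w": rng.normal(0, s, (FILTERS, WIDTH * EMB)),
        "conv_b": np.zeros(FILTERS),
        "out_w": rng.normal(0, s, (LABELS, FILTERS)),
        "out_b": np.zeros(LABELS),
        "res_w1": rng.normal(0, s, (HIDDEN, LABELS + FILTERS)),
        "res_w2": rng.normal(0, s, (LABELS, HIDDEN)),
    }


def encode(token_ids, p):
    """CNN encoder: variable-length note -> fixed-length vector.

    Assumes len(token_ids) >= WIDTH.
    """
    x = p["emb"][token_ids]                      # (T, EMB)
    n_windows = len(token_ids) - WIDTH + 1
    # Each row is one sliding window of WIDTH consecutive token embeddings.
    windows = np.stack([x[i:i + WIDTH].ravel() for i in range(n_windows)])
    feats = relu(windows @ p["conv_w"].T + p["conv_b"])  # (n_windows, FILTERS)
    return feats.max(axis=0)                     # max over time -> (FILTERS,)


def predict(token_ids, p, n_blocks=2):
    """Per-label probabilities from initial scores plus residual refinements."""
    h = encode(token_ids, p)
    scores = p["out_w"] @ h + p["out_b"]         # initial label scores
    for _ in range(n_blocks):
        # Assumed residual form: the correction sees both the current label
        # scores (label dependencies) and the sentence vector h directly.
        # The same residual weights are reused in every block for brevity.
        z = relu(p["res_w1"] @ np.concatenate([scores, h]))
        scores = scores + p["res_w2"] @ z        # skip connection
    return sigmoid(scores)                       # independent sigmoid per label
```

Calling `predict(np.array([3, 7, 7, 2, 9, 11]), init_params())` returns one probability per label. Using a sigmoid per label rather than a softmax over labels is what allows several diagnoses to be predicted for the same note.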

