Time-Aware Transformer-based Network for Clinical Notes Series Prediction

Dongyu Zhang, Jidapa Thadajarassiri, Cansu Sen, Elke Rundensteiner
Proceedings of the 5th Machine Learning for Healthcare Conference, PMLR 126:566-588, 2020.

Abstract

A patient’s clinical notes correspond to a sequence of free-form text documents generated by healthcare professionals over time. The rich and unique information in clinical notes is useful for clinical decision making. In this work, we propose a time-aware transformer-based hierarchical architecture, which we call Flexible Time-aware LSTM Transformer (FTL-Trans), for classifying a patient’s health state based on her series of clinical notes. FTL-Trans addresses a problem that current transformer-based architectures cannot handle: the multi-level structure inherent in clinical note series, where a note contains a sequence of chunks and each chunk in turn contains a sequence of words. At the bottom layer, FTL-Trans encodes equal-length subsequences of a patient’s clinical notes ("chunks") into content embeddings using a pre-trained ClinicalBERT model. Unlike ClinicalBERT, however, FTL-Trans merges each content embedding with sequential information in the second layer, producing a new position-enhanced chunk representation via an augmented multi-level position embedding. Next, the time-aware layer tackles the irregular spacing of notes in the series by learning a flexible time decay function and using it to incorporate both the position-enhanced chunk embeddings and time information into a patient representation. This patient representation is then fed into the top layer for classification. Together, this hierarchical design allows FTL-Trans to capture the multi-level sequential structure of the note series. Our extensive experimental evaluation, conducted on multiple patient cohorts extracted from the MIMIC dataset, shows that FTL-Trans consistently outperforms state-of-the-art transformer-based architectures by up to 5% in AUROC and 6% in accuracy.
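To make the layered design concrete, below is a minimal PyTorch sketch of the flow the abstract describes: pre-computed ClinicalBERT chunk embeddings, a position-enhanced chunk representation, a learned time-decay re-weighting, and an LSTM aggregation into a patient representation for classification. This is an illustration under stated assumptions, not the authors' implementation: the decay form a * sigmoid(-(b * dt + c)), the single-layer LSTM aggregator, and all names (FlexibleTimeDecay, TimeAwarePatientEncoder, chunk_pos, delta_t) are hypothetical.

import torch
import torch.nn as nn

class FlexibleTimeDecay(nn.Module):
    """Learnable decay g(dt) that down-weights chunks by elapsed time.
    The functional form a * sigmoid(-(b * dt + c)) is an assumption;
    the paper learns its own flexible decay function."""
    def __init__(self):
        super().__init__()
        self.a = nn.Parameter(torch.ones(1))
        self.b = nn.Parameter(torch.ones(1))
        self.c = nn.Parameter(torch.zeros(1))

    def forward(self, delta_t):  # delta_t: (batch, n_chunks), elapsed time per chunk
        return self.a * torch.sigmoid(-(self.b * delta_t + self.c))

class TimeAwarePatientEncoder(nn.Module):
    def __init__(self, dim=768, max_chunks=512, n_classes=2):
        super().__init__()
        self.pos_embed = nn.Embedding(max_chunks, dim)  # simplified stand-in for the multi-level position embedding
        self.decay = FlexibleTimeDecay()
        self.lstm = nn.LSTM(dim, dim, batch_first=True)
        self.classifier = nn.Linear(dim, n_classes)

    def forward(self, chunk_emb, chunk_pos, delta_t):
        # chunk_emb: (batch, n_chunks, dim) content embeddings, assumed
        # precomputed by ClinicalBERT over equal-length note chunks.
        x = chunk_emb + self.pos_embed(chunk_pos)    # position-enhanced chunk representations
        x = x * self.decay(delta_t).unsqueeze(-1)    # time-aware re-weighting of each chunk
        _, (h, _) = self.lstm(x)                     # aggregate chunks into a patient representation
        return self.classifier(h[-1])                # health-state prediction logits

# Example usage with random data:
# enc = TimeAwarePatientEncoder()
# emb = torch.randn(2, 10, 768)            # 2 patients, 10 chunks each
# pos = torch.arange(10).expand(2, 10)     # chunk order within each series
# dt = torch.rand(2, 10) * 48              # hours elapsed since each chunk
# logits = enc(emb, pos, dt)               # shape (2, n_classes)

The point the sketch tries to convey is the abstract's key idea: chunk embeddings are re-weighted by a learned function of elapsed time rather than treated as an evenly spaced sequence.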

Cite this Paper


BibTeX
@InProceedings{pmlr-v126-zhang20c,
  title = {Time-Aware Transformer-based Network for Clinical Notes Series Prediction},
  author = {Zhang, Dongyu and Thadajarassiri, Jidapa and Sen, Cansu and Rundensteiner, Elke},
  booktitle = {Proceedings of the 5th Machine Learning for Healthcare Conference},
  pages = {566--588},
  year = {2020},
  editor = {Doshi-Velez, Finale and Fackler, Jim and Jung, Ken and Kale, David and Ranganath, Rajesh and Wallace, Byron and Wiens, Jenna},
  volume = {126},
  series = {Proceedings of Machine Learning Research},
  month = {07--08 Aug},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v126/zhang20c/zhang20c.pdf},
  url = {https://proceedings.mlr.press/v126/zhang20c.html},
  abstract = {A patient’s clinical notes correspond to a sequence of free-form text documents generated by healthcare professionals over time. The rich and unique information in clinical notes is useful for clinical decision making. In this work, we propose a time-aware transformer-based hierarchical architecture, which we call Flexible Time-aware LSTM Transformer (FTL-Trans), for classifying a patient’s health state based on her series of clinical notes. FTL-Trans addresses a problem that current transformer-based architectures cannot handle: the multi-level structure inherent in clinical note series, where a note contains a sequence of chunks and each chunk in turn contains a sequence of words. At the bottom layer, FTL-Trans encodes equal-length subsequences of a patient’s clinical notes ("chunks") into content embeddings using a pre-trained ClinicalBERT model. Unlike ClinicalBERT, however, FTL-Trans merges each content embedding with sequential information in the second layer, producing a new position-enhanced chunk representation via an augmented multi-level position embedding. Next, the time-aware layer tackles the irregular spacing of notes in the series by learning a flexible time decay function and using it to incorporate both the position-enhanced chunk embeddings and time information into a patient representation. This patient representation is then fed into the top layer for classification. Together, this hierarchical design allows FTL-Trans to capture the multi-level sequential structure of the note series. Our extensive experimental evaluation, conducted on multiple patient cohorts extracted from the MIMIC dataset, shows that FTL-Trans consistently outperforms state-of-the-art transformer-based architectures by up to 5% in AUROC and 6% in accuracy.}
}
Endnote
%0 Conference Paper
%T Time-Aware Transformer-based Network for Clinical Notes Series Prediction
%A Dongyu Zhang
%A Jidapa Thadajarassiri
%A Cansu Sen
%A Elke Rundensteiner
%B Proceedings of the 5th Machine Learning for Healthcare Conference
%C Proceedings of Machine Learning Research
%D 2020
%E Finale Doshi-Velez
%E Jim Fackler
%E Ken Jung
%E David Kale
%E Rajesh Ranganath
%E Byron Wallace
%E Jenna Wiens
%F pmlr-v126-zhang20c
%I PMLR
%P 566--588
%U https://proceedings.mlr.press/v126/zhang20c.html
%V 126
%X A patient’s clinical notes correspond to a sequence of free-form text documents generated by healthcare professionals over time. The rich and unique information in clinical notes is useful for clinical decision making. In this work, we propose a time-aware transformer-based hierarchical architecture, which we call Flexible Time-aware LSTM Transformer (FTL-Trans), for classifying a patient’s health state based on her series of clinical notes. FTL-Trans addresses a problem that current transformer-based architectures cannot handle: the multi-level structure inherent in clinical note series, where a note contains a sequence of chunks and each chunk in turn contains a sequence of words. At the bottom layer, FTL-Trans encodes equal-length subsequences of a patient’s clinical notes ("chunks") into content embeddings using a pre-trained ClinicalBERT model. Unlike ClinicalBERT, however, FTL-Trans merges each content embedding with sequential information in the second layer, producing a new position-enhanced chunk representation via an augmented multi-level position embedding. Next, the time-aware layer tackles the irregular spacing of notes in the series by learning a flexible time decay function and using it to incorporate both the position-enhanced chunk embeddings and time information into a patient representation. This patient representation is then fed into the top layer for classification. Together, this hierarchical design allows FTL-Trans to capture the multi-level sequential structure of the note series. Our extensive experimental evaluation, conducted on multiple patient cohorts extracted from the MIMIC dataset, shows that FTL-Trans consistently outperforms state-of-the-art transformer-based architectures by up to 5% in AUROC and 6% in accuracy.
APA
Zhang, D., Thadajarassiri, J., Sen, C. & Rundensteiner, E. (2020). Time-Aware Transformer-based Network for Clinical Notes Series Prediction. Proceedings of the 5th Machine Learning for Healthcare Conference, in Proceedings of Machine Learning Research 126:566-588. Available from https://proceedings.mlr.press/v126/zhang20c.html.