UPSTAGE: Unsupervised Context Augmentation for Utterance Classification in Patient-Provider Communication

Do June Min, Veronica Perez-Rosas, Shihchen Kuo, William H. Herman, Rada Mihalcea
Proceedings of the 5th Machine Learning for Healthcare Conference, PMLR 126:895-912, 2020.

Abstract

Conversations between patients and providers in clinical settings provide a source of natural language data that may reflect and correlate with the patients’ experience and response to the treatment they are receiving. When analyzing utterances in such conversations, it is not sufficient to consider each sentence in isolation, since its context may play a role in determining its semantic meaning. Recently, contextual information in natural language documents has been modeled using various techniques, such as recurrent neural networks with latent variables, or neural networks with attention mechanisms. In this paper, we present UnsuPerviSed conText AuGmEntation (Upstage), a classification framework that relies on both local and global contextual information from different sources. Upstage uses transformer models with pretrained language models and joint sentence representation to solve the task of classifying health topics in patient-provider conversations. In addition, Upstage leverages unlabeled corpora for pretraining and data augmentation to provide additional context, which leads to improved classification performance.

Cite this Paper


BibTeX
@InProceedings{pmlr-v126-min20a,
  title = {UPSTAGE: Unsupervised Context Augmentation for Utterance Classification in Patient-Provider Communication},
  author = {Min, Do June and Perez-Rosas, Veronica and Kuo, Shihchen and Herman, William H. and Mihalcea, Rada},
  booktitle = {Proceedings of the 5th Machine Learning for Healthcare Conference},
  pages = {895--912},
  year = {2020},
  editor = {Doshi-Velez, Finale and Fackler, Jim and Jung, Ken and Kale, David and Ranganath, Rajesh and Wallace, Byron and Wiens, Jenna},
  volume = {126},
  series = {Proceedings of Machine Learning Research},
  month = {07--08 Aug},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v126/min20a/min20a.pdf},
  url = {https://proceedings.mlr.press/v126/min20a.html},
  abstract = {Conversations between patients and providers in clinical settings provide a source of natural language data that may reflect and correlate with the patients’ experience and response to the treatment they are receiving. When analyzing utterances in such conversations, it is not sufficient to consider each sentence in isolation, since its context may play a role in determining its semantic meaning. Recently, contextual information in natural language documents has been modeled using various techniques, such as recurrent neural networks with latent variables, or neural networks with attention mechanisms. In this paper, we present UnsuPerviSed conText AuGmEntation (Upstage), a classification framework that relies on both local and global contextual information from different sources. Upstage uses transformer models with pretrained language models and joint sentence representation to solve the task of classifying health topics in patient-provider conversations. In addition, Upstage leverages unlabeled corpora for pretraining and data augmentation to provide additional context, which leads to improved classification performance.}
}
Endnote
%0 Conference Paper
%T UPSTAGE: Unsupervised Context Augmentation for Utterance Classification in Patient-Provider Communication
%A Do June Min
%A Veronica Perez-Rosas
%A Shihchen Kuo
%A William H. Herman
%A Rada Mihalcea
%B Proceedings of the 5th Machine Learning for Healthcare Conference
%C Proceedings of Machine Learning Research
%D 2020
%E Finale Doshi-Velez
%E Jim Fackler
%E Ken Jung
%E David Kale
%E Rajesh Ranganath
%E Byron Wallace
%E Jenna Wiens
%F pmlr-v126-min20a
%I PMLR
%P 895--912
%U https://proceedings.mlr.press/v126/min20a.html
%V 126
%X Conversations between patients and providers in clinical settings provide a source of natural language data that may reflect and correlate with the patients’ experience and response to the treatment they are receiving. When analyzing utterances in such conversations, it is not sufficient to consider each sentence in isolation, since its context may play a role in determining its semantic meaning. Recently, contextual information in natural language documents has been modeled using various techniques, such as recurrent neural networks with latent variables, or neural networks with attention mechanisms. In this paper, we present UnsuPerviSed conText AuGmEntation (Upstage), a classification framework that relies on both local and global contextual information from different sources. Upstage uses transformer models with pretrained language models and joint sentence representation to solve the task of classifying health topics in patient-provider conversations. In addition, Upstage leverages unlabeled corpora for pretraining and data augmentation to provide additional context, which leads to improved classification performance.
APA
Min, D.J., Perez-Rosas, V., Kuo, S., Herman, W.H. & Mihalcea, R. (2020). UPSTAGE: Unsupervised Context Augmentation for Utterance Classification in Patient-Provider Communication. Proceedings of the 5th Machine Learning for Healthcare Conference, in Proceedings of Machine Learning Research 126:895-912. Available from https://proceedings.mlr.press/v126/min20a.html.
