Learning functional sections in medical conversations: iterative pseudo-labeling and human-in-the-loop approach

Mengqian Wang; Ilya Valmianski; Xavier Amatriain; Anitha Kannan

Proceedings of the 8th Machine Learning for Healthcare Conference, PMLR 219:772-787, 2023.

Abstract

Medical conversations between patients and medical professionals have implicit functional sections, such as “history taking”, “summarization”, “education”, and “care plan.” In this work, we are interested in learning to automatically extract these sections. A direct approach would require collecting large amounts of expert annotations for this task, which is inherently costly due to the contextual inter-and-intra variability between these sections. This paper presents an approach that tackles the problem of learning to classify medical dialogue into functional sections without requiring a large number of annotations. Our approach combines pseudo-labeling and human-in-the-loop. First, we bootstrap using weak supervision with pseudo-labeling to generate dialogue turn-level pseudo-labels and train a transformer-based model, which is then applied to individual sentences to create noisy sentence-level labels. Second, we iteratively refine sentence-level labels using a cluster-based human-in-the-loop approach. Each iteration requires only a few dozen annotator decisions. We evaluate the results on an expert-annotated dataset of 100 dialogues and find that while our models start with 69.5% accuracy, we can iteratively improve it to 82.5%. Code used to perform all experiments described in this paper can be found here: https://github.com/curai/curai-research/functional-sections.
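The cluster-based refinement step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (their code is at the repository linked above). It assumes sentences have already been given noisy labels and grouped into clusters by some upstream method, and that an annotator makes one decision per cluster. The function name `refine_labels_by_cluster`, the `annotate` callback, and the purity threshold are hypothetical names chosen for illustration only.

```python
from collections import Counter, defaultdict
from typing import Callable, Dict, List, Optional


def refine_labels_by_cluster(
    noisy_labels: List[str],
    cluster_ids: List[int],
    annotate: Callable[[int, str, float], Optional[str]],
) -> List[str]:
    """Run one refinement round with a single annotator decision per cluster.

    `annotate` is a hypothetical callback standing in for a human: it receives
    the cluster id, the cluster's majority label, and the fraction of members
    carrying that label, and returns a label to apply to the whole cluster,
    or None to leave the cluster unchanged.
    """
    # Group sentence indices by cluster.
    members: Dict[int, List[int]] = defaultdict(list)
    for idx, cid in enumerate(cluster_ids):
        members[cid].append(idx)

    refined = list(noisy_labels)
    for cid, idxs in members.items():
        counts = Counter(noisy_labels[i] for i in idxs)
        majority, n = counts.most_common(1)[0]
        purity = n / len(idxs)
        decision = annotate(cid, majority, purity)
        if decision is not None:
            # Propagate the single decision to every sentence in the cluster.
            for i in idxs:
                refined[i] = decision
    return refined


# Toy data: five sentences in two clusters (assumed, not from the paper).
labels = ["history taking", "history taking", "care plan", "education", "care plan"]
clusters = [0, 0, 0, 1, 1]


def simulated_annotator(cid: int, majority: str, purity: float) -> Optional[str]:
    # Stand-in for a human: accept the majority label only for fairly pure clusters.
    return majority if purity >= 0.6 else None


refined = refine_labels_by_cluster(labels, clusters, simulated_annotator)
print(refined)
```

In this toy run, cluster 0 is relabeled as a whole while cluster 1 is left untouched, so the number of decisions scales with the number of clusters rather than the number of sentences. That is the property the abstract points to when it says each iteration needs only a few dozen annotator decisions.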

Cite this Paper

BibTeX


@InProceedings{pmlr-v219-wang23a,
  title = 	 {Learning functional sections in medical conversations: iterative pseudo-labeling and human-in-the-loop approach},
  author =       {Wang, Mengqian and Valmianski, Ilya and Amatriain, Xavier and Kannan, Anitha},
  booktitle = 	 {Proceedings of the 8th Machine Learning for Healthcare Conference},
  pages = 	 {772--787},
  year = 	 {2023},
  editor = 	 {Deshpande, Kaivalya and Fiterau, Madalina and Joshi, Shalmali and Lipton, Zachary and Ranganath, Rajesh and Urteaga, Iñigo and Yeung, Serene},
  volume = 	 {219},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {11--12 Aug},
  publisher =    {PMLR},
  pdf = 	 {https://proceedings.mlr.press/v219/wang23a/wang23a.pdf},
  url = 	 {https://proceedings.mlr.press/v219/wang23a.html},
  abstract = 	 {Medical conversations between patients and medical professionals have implicit functional sections, such as “history taking”, “summarization”, “education”, and “care plan.” In this work, we are interested in learning to automatically extract these sections. A direct approach would require collecting large amounts of expert annotations for this task, which is inherently costly due to the contextual inter-and-intra variability between these sections. This paper presents an approach that tackles the problem of learning to classify medical dialogue into functional sections without requiring a large number of annotations. Our approach combines pseudo-labeling and human-in-the-loop. First, we bootstrap using weak supervision with pseudo-labeling to generate dialogue turn-level pseudo-labels and train a transformer-based model, which is then applied to individual sentences to create noisy sentence-level labels. Second, we iteratively refine sentence-level labels using a cluster-based human-in-the-loop approach. Each iteration requires only a few dozen annotator decisions. We evaluate the results on an expert-annotated dataset of 100 dialogues and find that while our models start with 69.5% accuracy, we can iteratively improve it to 82.5%. Code used to perform all experiments described in this paper can be found here: https://github.com/curai/curai-research/functional-sections.}
}

Endnote

%0 Conference Paper
%T Learning functional sections in medical conversations: iterative pseudo-labeling and human-in-the-loop approach
%A Mengqian Wang
%A Ilya Valmianski
%A Xavier Amatriain
%A Anitha Kannan
%B Proceedings of the 8th Machine Learning for Healthcare Conference
%C Proceedings of Machine Learning Research
%D 2023
%E Kaivalya Deshpande
%E Madalina Fiterau
%E Shalmali Joshi
%E Zachary Lipton
%E Rajesh Ranganath
%E Iñigo Urteaga
%E Serene Yeung
%F pmlr-v219-wang23a
%I PMLR
%P 772--787
%U https://proceedings.mlr.press/v219/wang23a.html
%V 219
%X Medical conversations between patients and medical professionals have implicit functional sections, such as “history taking”, “summarization”, “education”, and “care plan.” In this work, we are interested in learning to automatically extract these sections. A direct approach would require collecting large amounts of expert annotations for this task, which is inherently costly due to the contextual inter-and-intra variability between these sections. This paper presents an approach that tackles the problem of learning to classify medical dialogue into functional sections without requiring a large number of annotations. Our approach combines pseudo-labeling and human-in-the-loop. First, we bootstrap using weak supervision with pseudo-labeling to generate dialogue turn-level pseudo-labels and train a transformer-based model, which is then applied to individual sentences to create noisy sentence-level labels. Second, we iteratively refine sentence-level labels using a cluster-based human-in-the-loop approach. Each iteration requires only a few dozen annotator decisions. We evaluate the results on an expert-annotated dataset of 100 dialogues and find that while our models start with 69.5% accuracy, we can iteratively improve it to 82.5%. Code used to perform all experiments described in this paper can be found here: https://github.com/curai/curai-research/functional-sections.

APA


Wang, M., Valmianski, I., Amatriain, X. & Kannan, A. (2023). Learning functional sections in medical conversations: iterative pseudo-labeling and human-in-the-loop approach. Proceedings of the 8th Machine Learning for Healthcare Conference, in Proceedings of Machine Learning Research 219:772-787. Available from https://proceedings.mlr.press/v219/wang23a.html.
