Searching for Fine-Grained Queries in Radiology Reports Using Similarity-Preserving Contrastive Embedding

Tanveer Syeda-Mahmood, Luyao Shi
Proceedings of the 7th Machine Learning for Healthcare Conference, PMLR 182:785-799, 2022.

Abstract

The ability to search in unstructured reports of electronic health records requires tools that can recognize clinically meaningful fine-grained descriptions both in queries and in report sentences. Existing methods of searching reports that use either information retrieval or deep learning techniques to model use context lack an inherent understanding of the clinical concepts or their variants that capture the same underlying clinical semantics. In this paper, we present a new search algorithm that combines principles of information retrieval and deep learning-driven textual encoding approaches with natural language analysis of sentences in reports for fine-grained descriptors of concepts. In particular, we learn a clinical similarity-preserving embedding from a chest X-ray lexicon using a new contrastive loss. This allows us to form a report index that is robust to different ways of expressing clinical concepts in queries. The results show marked improvement in the quality of retrieved reports as judged through average recall and mean average precision over a broad range of difficult queries.

Cite this Paper


BibTeX
@InProceedings{pmlr-v182-syeda-mahmood22a,
  title = {Searching for Fine-Grained Queries in Radiology Reports Using Similarity-Preserving Contrastive Embedding},
  author = {Syeda-Mahmood, Tanveer and Shi, Luyao},
  booktitle = {Proceedings of the 7th Machine Learning for Healthcare Conference},
  pages = {785--799},
  year = {2022},
  editor = {Lipton, Zachary and Ranganath, Rajesh and Sendak, Mark and Sjoding, Michael and Yeung, Serena},
  volume = {182},
  series = {Proceedings of Machine Learning Research},
  month = {05--06 Aug},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v182/syeda-mahmood22a/syeda-mahmood22a.pdf},
  url = {https://proceedings.mlr.press/v182/syeda-mahmood22a.html},
  abstract = {The ability to search in unstructured reports of electronic health records requires tools that can recognize clinically meaningful fine-grained descriptions both in queries and in report sentences. Existing methods of searching reports that use either information retrieval or deep learning techniques to model use context, lack an inherent understanding of the clinical concepts or their variants that capture the same underlying clinical semantics. In this paper, we present a new search algorithm that combines principles of information retrieval and deep learning-driven textual encoding approaches with natural language analysis of sentences in reports for fine-grained descriptors of concepts. In particular, we learn a clinical similarity-preserving embedding from a chest X-ray lexicon using a new contrastive loss. This allows us to form a report index that is robust to different forms of expressing for clinical concepts in queries. The results show marked improvement in the quality of retrieved reports as judged through average recall and mean average precision over a broad range of difficult queries.}
}
Endnote
%0 Conference Paper
%T Searching for Fine-Grained Queries in Radiology Reports Using Similarity-Preserving Contrastive Embedding
%A Tanveer Syeda-Mahmood
%A Luyao Shi
%B Proceedings of the 7th Machine Learning for Healthcare Conference
%C Proceedings of Machine Learning Research
%D 2022
%E Zachary Lipton
%E Rajesh Ranganath
%E Mark Sendak
%E Michael Sjoding
%E Serena Yeung
%F pmlr-v182-syeda-mahmood22a
%I PMLR
%P 785--799
%U https://proceedings.mlr.press/v182/syeda-mahmood22a.html
%V 182
%X The ability to search in unstructured reports of electronic health records requires tools that can recognize clinically meaningful fine-grained descriptions both in queries and in report sentences. Existing methods of searching reports that use either information retrieval or deep learning techniques to model use context, lack an inherent understanding of the clinical concepts or their variants that capture the same underlying clinical semantics. In this paper, we present a new search algorithm that combines principles of information retrieval and deep learning-driven textual encoding approaches with natural language analysis of sentences in reports for fine-grained descriptors of concepts. In particular, we learn a clinical similarity-preserving embedding from a chest X-ray lexicon using a new contrastive loss. This allows us to form a report index that is robust to different forms of expressing for clinical concepts in queries. The results show marked improvement in the quality of retrieved reports as judged through average recall and mean average precision over a broad range of difficult queries.
APA
Syeda-Mahmood, T. & Shi, L. (2022). Searching for Fine-Grained Queries in Radiology Reports Using Similarity-Preserving Contrastive Embedding. Proceedings of the 7th Machine Learning for Healthcare Conference, in Proceedings of Machine Learning Research 182:785-799. Available from https://proceedings.mlr.press/v182/syeda-mahmood22a.html.

Related Material