DOSSIER: Fact Checking in Electronic Health Records while Preserving Patient Privacy

Haoran Zhang, Supriya Nagesh, Milind Shyani, Nina Mishra
Proceedings of the 9th Machine Learning for Healthcare Conference, PMLR 252, 2024.

Abstract

Given a particular claim about a specific document, the fact checking problem is to determine if the claim is true and, if so, provide corroborating evidence. The problem is motivated by contexts where a document is too lengthy to quickly read and find an answer. This paper focuses on electronic health records, or a medical dossier, where a physician has a pointed claim to make about the record. Prior methods that rely on directly prompting an LLM may suffer from hallucinations and violate privacy constraints. We present a system, DOSSIER, that verifies claims related to the tabular data within a document. For a clinical record, the tables include timestamped vital signs, medications, and labs. DOSSIER weaves together methods for tagging medical entities within a claim, converting natural language to SQL, and utilizing biomedical knowledge graphs, in order to identify rows across multiple tables that prove the answer. A distinguishing and desirable characteristic of DOSSIER is that no private medical records are shared with an LLM. An extensive experimental evaluation is conducted over a large corpus of medical records demonstrating improved accuracy over five baselines. Our methods provide hope that physicians can privately, quickly, and accurately fact check a claim in an evidence-based fashion.

Cite this Paper


BibTeX
@InProceedings{pmlr-v252-zhang24a,
  title = {{DOSSIER}: Fact Checking in Electronic Health Records while Preserving Patient Privacy},
  author = {Zhang, Haoran and Nagesh, Supriya and Shyani, Milind and Mishra, Nina},
  booktitle = {Proceedings of the 9th Machine Learning for Healthcare Conference},
  year = {2024},
  editor = {Deshpande, Kaivalya and Fiterau, Madalina and Joshi, Shalmali and Lipton, Zachary and Ranganath, Rajesh and Urteaga, Iñigo},
  volume = {252},
  series = {Proceedings of Machine Learning Research},
  month = {16--17 Aug},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v252/main/assets/zhang24a/zhang24a.pdf},
  url = {https://proceedings.mlr.press/v252/zhang24a.html},
  abstract = {Given a particular claim about a specific document, the fact checking problem is to determine if the claim is true and, if so, provide corroborating evidence. The problem is motivated by contexts where a document is too lengthy to quickly read and find an answer. This paper focuses on electronic health records, or a medical dossier, where a physician has a pointed claim to make about the record. Prior methods that rely on directly prompting an LLM may suffer from hallucinations and violate privacy constraints. We present a system, DOSSIER, that verifies claims related to the tabular data within a document. For a clinical record, the tables include timestamped vital signs, medications, and labs. DOSSIER weaves together methods for tagging medical entities within a claim, converting natural language to SQL, and utilizing biomedical knowledge graphs, in order to identify rows across multiple tables that prove the answer. A distinguishing and desirable characteristic of DOSSIER is that no private medical records are shared with an LLM. An extensive experimental evaluation is conducted over a large corpus of medical records demonstrating improved accuracy over five baselines. Our methods provide hope that physicians can privately, quickly, and accurately fact check a claim in an evidence-based fashion.}
}
Endnote
%0 Conference Paper
%T DOSSIER: Fact Checking in Electronic Health Records while Preserving Patient Privacy
%A Haoran Zhang
%A Supriya Nagesh
%A Milind Shyani
%A Nina Mishra
%B Proceedings of the 9th Machine Learning for Healthcare Conference
%C Proceedings of Machine Learning Research
%D 2024
%E Kaivalya Deshpande
%E Madalina Fiterau
%E Shalmali Joshi
%E Zachary Lipton
%E Rajesh Ranganath
%E Iñigo Urteaga
%F pmlr-v252-zhang24a
%I PMLR
%U https://proceedings.mlr.press/v252/zhang24a.html
%V 252
%X Given a particular claim about a specific document, the fact checking problem is to determine if the claim is true and, if so, provide corroborating evidence. The problem is motivated by contexts where a document is too lengthy to quickly read and find an answer. This paper focuses on electronic health records, or a medical dossier, where a physician has a pointed claim to make about the record. Prior methods that rely on directly prompting an LLM may suffer from hallucinations and violate privacy constraints. We present a system, DOSSIER, that verifies claims related to the tabular data within a document. For a clinical record, the tables include timestamped vital signs, medications, and labs. DOSSIER weaves together methods for tagging medical entities within a claim, converting natural language to SQL, and utilizing biomedical knowledge graphs, in order to identify rows across multiple tables that prove the answer. A distinguishing and desirable characteristic of DOSSIER is that no private medical records are shared with an LLM. An extensive experimental evaluation is conducted over a large corpus of medical records demonstrating improved accuracy over five baselines. Our methods provide hope that physicians can privately, quickly, and accurately fact check a claim in an evidence-based fashion.
APA
Zhang, H., Nagesh, S., Shyani, M., & Mishra, N. (2024). DOSSIER: Fact Checking in Electronic Health Records while Preserving Patient Privacy. Proceedings of the 9th Machine Learning for Healthcare Conference, in Proceedings of Machine Learning Research 252. Available from https://proceedings.mlr.press/v252/zhang24a.html.
