Directing Human Attention in Event Localization for Clinical Timeline Creation
Proceedings of the 6th Machine Learning for Healthcare Conference, PMLR 149:80-102, 2021.
Many variables useful for clinical research (e.g., patient disease state, treatment regimens) are trapped in free-text clinical notes. Structuring such variables for downstream use typically involves a tedious process in which domain experts manually search through long clinical timelines. Natural language processing systems present an opportunity to automate this workflow, but algorithms still have trouble accurately parsing the most complex patient cases, which may be best deferred to experts. In this work, we present a framework that automatically structures simple patient cases and, when required, iteratively requests human input: specifically, a label for the single note in the patient’s timeline that would most decrease uncertainty in the model output. Our method provides a lightweight way to leverage domain experts. We test our system on two tasks from a cohort of oncology patients: identifying the date of (i) metastasis onset and (ii) oral therapy start. Compared to standard search heuristics, we show that our method can reduce model errors by 80% with less than 15% of the manual annotation effort that may otherwise be required.
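The iterative request for a single-note label can be illustrated with a minimal sketch. It is not the implementation evaluated in this work, and it rests on several assumptions: the model output is taken to be a probability distribution `p` over which note in the timeline is the first to reflect the event, the expert is modeled as a hypothetical `oracle(j)` that answers whether the event has occurred by note `j`, and the note to query is chosen by minimizing expected posterior entropy. All function names and thresholds are illustrative.

```python
import math


def entropy(p):
    """Shannon entropy (bits) of a distribution over onset positions."""
    return -sum(q * math.log2(q) for q in p if q > 0)


def condition(p, j, occurred):
    """Posterior over the onset index after learning whether onset <= j.

    Assumes the answer is consistent with p (the kept mass is nonzero).
    """
    kept = [q if (i <= j) == occurred else 0.0 for i, q in enumerate(p)]
    z = sum(kept)
    return [q / z for q in kept]


def best_query(p):
    """Index of the note whose label minimizes expected posterior entropy.

    Returns None when no note would be informative.
    """
    best_j, best_h = None, float("inf")
    for j in range(len(p) - 1):
        left, right = sum(p[: j + 1]), sum(p[j + 1:])
        if left == 0 or right == 0:
            continue  # the answer is already certain, so asking gains nothing
        h = (left * entropy(condition(p, j, True))
             + right * entropy(condition(p, j, False))) / (left + right)
        if h < best_h:
            best_j, best_h = j, h
    return best_j


def localize(p, oracle, max_entropy=0.5, max_queries=10):
    """Ask for labels only while the model remains uncertain.

    Assumed stopping rule: entropy at or below max_entropy (hypothetical
    threshold) or the query budget is exhausted.
    Returns (most likely onset index, number of labels requested).
    """
    queries = 0
    while entropy(p) > max_entropy and queries < max_queries:
        j = best_query(p)
        if j is None:
            break
        p = condition(p, j, oracle(j))
        queries += 1
    return max(range(len(p)), key=p.__getitem__), queries


if __name__ == "__main__":
    # Toy timeline of five notes; the true onset is note 3 (assumed for the demo).
    prior = [0.1, 0.2, 0.3, 0.25, 0.15]
    print(localize(prior, oracle=lambda j: 3 <= j))
```

Under these assumptions, a confident prior (low entropy) needs no expert input at all, while an uncertain one triggers a small number of targeted label requests rather than a full manual review of the timeline.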