RadGraph2: Modeling Disease Progression in Radiology Reports via Hierarchical Information Extraction

Sameer Khanna, Adam Dejl, Kibo Yoon, Steven QH Truong, Hanh Duong, Agustina Saenz, Pranav Rajpurkar
Proceedings of the 8th Machine Learning for Healthcare Conference, PMLR 219:381-402, 2023.

Abstract

We present RadGraph2, a novel dataset for extracting information from radiology reports that focuses on capturing changes in disease state and device placement over time. We introduce a hierarchical schema that organizes entities based on their relationships and show that using this hierarchy during training improves the performance of an information extraction model. Specifically, we propose a modification to the DyGIE++ framework, resulting in our model HGIE, which outperforms previous models in entity and relation extraction tasks. We demonstrate that RadGraph2 enables models to capture a wider variety of findings and to perform better at relation extraction than models trained on the original RadGraph dataset. Our work provides a foundation both for developing automated systems that can track disease progression over time and for developing information extraction models that leverage the natural hierarchy of labels in the medical domain.

Cite this Paper


BibTeX
@InProceedings{pmlr-v219-khanna23a,
  title = {RadGraph2: Modeling Disease Progression in Radiology Reports via Hierarchical Information Extraction},
  author = {Khanna, Sameer and Dejl, Adam and Yoon, Kibo and Truong, Steven QH and Duong, Hanh and Saenz, Agustina and Rajpurkar, Pranav},
  booktitle = {Proceedings of the 8th Machine Learning for Healthcare Conference},
  pages = {381--402},
  year = {2023},
  editor = {Deshpande, Kaivalya and Fiterau, Madalina and Joshi, Shalmali and Lipton, Zachary and Ranganath, Rajesh and Urteaga, Iñigo and Yeung, Serene},
  volume = {219},
  series = {Proceedings of Machine Learning Research},
  month = {11--12 Aug},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v219/khanna23a/khanna23a.pdf},
  url = {https://proceedings.mlr.press/v219/khanna23a.html},
  abstract = {We present RadGraph2, a novel dataset for extracting information from radiology reports that focuses on capturing changes in disease state and device placement over time. We introduce a hierarchical schema that organizes entities based on their relationships and show that using this hierarchy during training improves the performance of an information extraction model. Specifically, we propose a modification to the DyGIE++ framework, resulting in our model HGIE, which outperforms previous models in entity and relation extraction tasks. We demonstrate that RadGraph2 enables models to capture a wider variety of findings and perform better at relation extraction compared to those trained on the original RadGraph dataset. Our work provides the foundation for developing automated systems that can track disease progression over time and develop information extraction models that leverage the natural hierarchy of labels in the medical domain.}
}
Endnote
%0 Conference Paper
%T RadGraph2: Modeling Disease Progression in Radiology Reports via Hierarchical Information Extraction
%A Sameer Khanna
%A Adam Dejl
%A Kibo Yoon
%A Steven QH Truong
%A Hanh Duong
%A Agustina Saenz
%A Pranav Rajpurkar
%B Proceedings of the 8th Machine Learning for Healthcare Conference
%C Proceedings of Machine Learning Research
%D 2023
%E Kaivalya Deshpande
%E Madalina Fiterau
%E Shalmali Joshi
%E Zachary Lipton
%E Rajesh Ranganath
%E Iñigo Urteaga
%E Serene Yeung
%F pmlr-v219-khanna23a
%I PMLR
%P 381--402
%U https://proceedings.mlr.press/v219/khanna23a.html
%V 219
%X We present RadGraph2, a novel dataset for extracting information from radiology reports that focuses on capturing changes in disease state and device placement over time. We introduce a hierarchical schema that organizes entities based on their relationships and show that using this hierarchy during training improves the performance of an information extraction model. Specifically, we propose a modification to the DyGIE++ framework, resulting in our model HGIE, which outperforms previous models in entity and relation extraction tasks. We demonstrate that RadGraph2 enables models to capture a wider variety of findings and perform better at relation extraction compared to those trained on the original RadGraph dataset. Our work provides the foundation for developing automated systems that can track disease progression over time and develop information extraction models that leverage the natural hierarchy of labels in the medical domain.
APA
Khanna, S., Dejl, A., Yoon, K., Truong, S.Q., Duong, H., Saenz, A., & Rajpurkar, P. (2023). RadGraph2: Modeling Disease Progression in Radiology Reports via Hierarchical Information Extraction. Proceedings of the 8th Machine Learning for Healthcare Conference, in Proceedings of Machine Learning Research 219:381-402. Available from https://proceedings.mlr.press/v219/khanna23a.html.
