Dialogue to Question Generation for Evidence-based Medical Guideline Agent Development

Zongliang Ji, Ziyang Zhang, Xincheng Tan, Matthew Thompson, Anna Goldenberg, Carl Yang, Rahul G. Krishnan, Fan Zhang
Proceedings of the Fifth Machine Learning for Health Symposium, PMLR 297:721-739, 2026.

Abstract

Evidence-based medicine (EBM) is central to high-quality care, but remains difficult to implement in fast-paced primary care settings. Physicians face short consultations, increasing patient loads, and lengthy guideline documents that are impractical to consult in real time. To address this gap, we investigate the feasibility of using large language models (LLMs) as ambient assistants that surface targeted, evidence-based questions during physician–patient encounters. Our study focuses on question generation rather than question answering, with the aim of scaffolding physician reasoning and integrating guideline-based practice into brief consultations. We implemented two prompting strategies, a zero-shot baseline and a multi-stage reasoning variant, using Gemini 2.5 as the backbone model. We evaluated on a benchmark of 80 de-identified transcripts from real clinical encounters, with six experienced physicians contributing over 90 hours of structured review. Results indicate that while general-purpose LLMs are not yet fully reliable, they can produce clinically meaningful and guideline-relevant questions, suggesting significant potential to reduce cognitive burden and make EBM more actionable at the point of care.

Cite this Paper


BibTeX
@InProceedings{pmlr-v297-ji26a,
  title     = {Dialogue to Question Generation for Evidence-based Medical Guideline Agent Development},
  author    = {Ji, Zongliang and Zhang, Ziyang and Tan, Xincheng and Thompson, Matthew and Goldenberg, Anna and Yang, Carl and Krishnan, Rahul G. and Zhang, Fan},
  booktitle = {Proceedings of the Fifth Machine Learning for Health Symposium},
  pages     = {721--739},
  year      = {2026},
  editor    = {Argaw, Peniel and Zhang, Haoran and Jabbour, Sarah and Chandak, Payal and Ji, Jerry and Mukherjee, Sumit and Salaudeen, Olawale and Chang, Trenton and Healey, Elizabeth and Gröger, Fabian and Adibi, Amin and Hegselmann, Stefan and Wild, Benjamin and Noori, Ayush},
  volume    = {297},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--14 Dec},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v297/main/assets/ji26a/ji26a.pdf},
  url       = {https://proceedings.mlr.press/v297/ji26a.html},
  abstract  = {Evidence-based medicine ({EBM}) is central to high-quality care, but remains difficult to implement in fast-paced primary care settings. Physicians face short consultations, increasing patient loads, and lengthy guideline documents that are impractical to consult in real time. To address this gap, we investigate the feasibility of using large language models ({LLM}s) as ambient assistants that surface targeted, evidence-based questions during physician–patient encounters. Our study focuses on question generation rather than question answering, with the aim of scaffolding physician reasoning and integrating guideline-based practice into brief consultations. We implemented two prompting strategies, a zero-shot baseline and a multi-stage reasoning variant, using Gemini 2.5 as the backbone model. We evaluated on a benchmark of 80 de-identified transcripts from real clinical encounters, with six experienced physicians contributing over 90 hours of structured review. Results indicate that while general-purpose {LLM}s are not yet fully reliable, they can produce clinically meaningful and guideline-relevant questions, suggesting significant potential to reduce cognitive burden and make {EBM} more actionable at the point of care.}
}
Endnote
%0 Conference Paper
%T Dialogue to Question Generation for Evidence-based Medical Guideline Agent Development
%A Zongliang Ji
%A Ziyang Zhang
%A Xincheng Tan
%A Matthew Thompson
%A Anna Goldenberg
%A Carl Yang
%A Rahul G. Krishnan
%A Fan Zhang
%B Proceedings of the Fifth Machine Learning for Health Symposium
%C Proceedings of Machine Learning Research
%D 2026
%E Peniel Argaw
%E Haoran Zhang
%E Sarah Jabbour
%E Payal Chandak
%E Jerry Ji
%E Sumit Mukherjee
%E Olawale Salaudeen
%E Trenton Chang
%E Elizabeth Healey
%E Fabian Gröger
%E Amin Adibi
%E Stefan Hegselmann
%E Benjamin Wild
%E Ayush Noori
%F pmlr-v297-ji26a
%I PMLR
%P 721--739
%U https://proceedings.mlr.press/v297/ji26a.html
%V 297
%X Evidence-based medicine (EBM) is central to high-quality care, but remains difficult to implement in fast-paced primary care settings. Physicians face short consultations, increasing patient loads, and lengthy guideline documents that are impractical to consult in real time. To address this gap, we investigate the feasibility of using large language models (LLMs) as ambient assistants that surface targeted, evidence-based questions during physician–patient encounters. Our study focuses on question generation rather than question answering, with the aim of scaffolding physician reasoning and integrating guideline-based practice into brief consultations. We implemented two prompting strategies, a zero-shot baseline and a multi-stage reasoning variant, using Gemini 2.5 as the backbone model. We evaluated on a benchmark of 80 de-identified transcripts from real clinical encounters, with six experienced physicians contributing over 90 hours of structured review. Results indicate that while general-purpose LLMs are not yet fully reliable, they can produce clinically meaningful and guideline-relevant questions, suggesting significant potential to reduce cognitive burden and make EBM more actionable at the point of care.
APA
Ji, Z., Zhang, Z., Tan, X., Thompson, M., Goldenberg, A., Yang, C., Krishnan, R.G. & Zhang, F. (2026). Dialogue to Question Generation for Evidence-based Medical Guideline Agent Development. Proceedings of the Fifth Machine Learning for Health Symposium, in Proceedings of Machine Learning Research 297:721-739. Available from https://proceedings.mlr.press/v297/ji26a.html.