All Required, In Order: Phase-Level Evaluation for AI–Human Dialogue in Healthcare and Beyond

Shubham Kulkarni, Alexander Lyzhov, Shiva Chaitanya, Preetam Joshi
Proceedings of The Second AAAI Bridge Program on AI for Medicine and Healthcare, PMLR 317:169-180, 2026.

Abstract

Conversational AI is starting to support real clinical work, but most evaluation methods miss how compliance depends on the full course of a conversation. We introduce Obligatory-Information Phase Structured Compliance Evaluation (OIP–SCE), an evaluation method that checks whether every required clinical obligation is met, in the right order, with clear evidence for clinicians to review. This makes complex rules practical and auditable, helping close the gap between technical progress and what healthcare actually needs. We demonstrate the method in two case studies (respiratory history, benefits verification) and show how phase-level evidence turns policy into shared, actionable steps. By giving clinicians control over what to check and engineers a clear specification to implement, OIP–SCE provides a single, auditable evaluation surface that aligns AI capability with clinical workflow and supports routine, safe use.

Cite this Paper


BibTeX
@InProceedings{pmlr-v317-kulkarni26a,
  title     = {All Required, In Order: Phase-Level Evaluation for AI–Human Dialogue in Healthcare and Beyond},
  author    = {Kulkarni, Shubham and Lyzhov, Alexander and Chaitanya, Shiva and Joshi, Preetam},
  booktitle = {Proceedings of The Second AAAI Bridge Program on AI for Medicine and Healthcare},
  pages     = {169--180},
  year      = {2026},
  editor    = {Wu, Junde and Pan, Jiazhen and Zhu, Jiayuan and Luo, Luyang and Li, Yitong and Xu, Min and Jin, Yueming and Rueckert, Daniel},
  volume    = {317},
  series    = {Proceedings of Machine Learning Research},
  month     = {20--21 Jan},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v317/main/assets/kulkarni26a/kulkarni26a.pdf},
  url       = {https://proceedings.mlr.press/v317/kulkarni26a.html},
  abstract  = {Conversational AI is starting to support real clinical work, but most evaluation methods miss how compliance depends on the full course of a conversation. We introduce Obligatory-Information Phase Structured Compliance Evaluation (OIP–SCE), an evaluation method that checks whether every required clinical obligation is met, in the right order, with clear evidence for clinicians to review. This makes complex rules practical and auditable, helping close the gap between technical progress and what healthcare actually needs. We demonstrate the method in two case studies (respiratory history, benefits verification) and show how phase-level evidence turns policy into shared, actionable steps. By giving clinicians control over what to check and engineers a clear specification to implement, OIP–SCE provides a single, auditable evaluation surface that aligns AI capability with clinical workflow and supports routine, safe use.}
}
Endnote
%0 Conference Paper
%T All Required, In Order: Phase-Level Evaluation for AI–Human Dialogue in Healthcare and Beyond
%A Shubham Kulkarni
%A Alexander Lyzhov
%A Shiva Chaitanya
%A Preetam Joshi
%B Proceedings of The Second AAAI Bridge Program on AI for Medicine and Healthcare
%C Proceedings of Machine Learning Research
%D 2026
%E Junde Wu
%E Jiazhen Pan
%E Jiayuan Zhu
%E Luyang Luo
%E Yitong Li
%E Min Xu
%E Yueming Jin
%E Daniel Rueckert
%F pmlr-v317-kulkarni26a
%I PMLR
%P 169--180
%U https://proceedings.mlr.press/v317/kulkarni26a.html
%V 317
%X Conversational AI is starting to support real clinical work, but most evaluation methods miss how compliance depends on the full course of a conversation. We introduce Obligatory-Information Phase Structured Compliance Evaluation (OIP–SCE), an evaluation method that checks whether every required clinical obligation is met, in the right order, with clear evidence for clinicians to review. This makes complex rules practical and auditable, helping close the gap between technical progress and what healthcare actually needs. We demonstrate the method in two case studies (respiratory history, benefits verification) and show how phase-level evidence turns policy into shared, actionable steps. By giving clinicians control over what to check and engineers a clear specification to implement, OIP–SCE provides a single, auditable evaluation surface that aligns AI capability with clinical workflow and supports routine, safe use.
APA
Kulkarni, S., Lyzhov, A., Chaitanya, S. & Joshi, P. (2026). All Required, In Order: Phase-Level Evaluation for AI–Human Dialogue in Healthcare and Beyond. Proceedings of The Second AAAI Bridge Program on AI for Medicine and Healthcare, in Proceedings of Machine Learning Research 317:169-180. Available from https://proceedings.mlr.press/v317/kulkarni26a.html.
