CardAIc-Agents: A Multimodal Framework with Hierarchical Adaptation for Cardiac Care Support

Yuting Zhang, Karina V. Bunting, Asgher Champsi, Xiaoxia Wang, Wenqi Lu, Alexander Thorley, Sandeep S Hothi, Zhaowen Qiu, Baturalp Buyukates, Dipak Kotecha, Jinming Duan
Proceedings of The 9th International Conference on Medical Imaging with Deep Learning, PMLR 315:2143-2170, 2026.

Abstract

Cardiovascular diseases (CVDs) remain the foremost cause of mortality worldwide, a burden worsened by a severe deficit of healthcare workers. Artificial intelligence (AI) agents have shown potential to alleviate this gap through automated detection and proactive screening, yet their clinical application remains limited by: (1) rigid sequential workflows, whereas clinical care often requires adaptive reasoning that selects specific tests and, based on their results, guides personalised next steps; (2) reliance solely on intrinsic model capabilities to perform role assignment without domain-specific tool support; (3) general and static knowledge bases without continuous learning capability; and (4) fixed unimodal or bimodal inputs and lack of on-demand visual outputs when clinicians require visual clarification. In response, a multimodal framework, CardAIc-Agents, is proposed to augment models with external tools and adaptively support diverse cardiac tasks. First, a CardiacRAG agent generates task-aware plans from updatable cardiac knowledge, while the Chief agent integrates tools to autonomously execute these plans and deliver decisions. Second, to enable adaptive and case-specific customization, a stepwise update strategy is developed to dynamically refine plans based on preceding execution results once the task is assessed as complex. Third, a multidisciplinary discussion team is proposed, which is automatically invoked to interpret challenging cases, thereby supporting further adaptation. In addition, visual review panels are provided to assist validation when clinicians raise concerns. Experiments across three datasets demonstrated the effectiveness of CardAIc-Agents compared to mainstream Vision–Language Models (VLMs) and state-of-the-art agentic systems.

Cite this Paper


BibTeX
@InProceedings{pmlr-v315-zhang26a,
  title     = {CardAIc-Agents: A Multimodal Framework with Hierarchical Adaptation for Cardiac Care Support},
  author    = {Zhang, Yuting and Bunting, Karina V. and Champsi, Asgher and Wang, Xiaoxia and Lu, Wenqi and Thorley, Alexander and Hothi, Sandeep S and Qiu, Zhaowen and Buyukates, Baturalp and Kotecha, Dipak and Duan, Jinming},
  booktitle = {Proceedings of The 9th International Conference on Medical Imaging with Deep Learning},
  pages     = {2143--2170},
  year      = {2026},
  editor    = {Huo, Yuankai and Gao, Mingchen and Kuo, Chang-Fu and Jin, Yueming and Deng, Ruining},
  volume    = {315},
  series    = {Proceedings of Machine Learning Research},
  month     = {08--10 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v315/main/assets/zhang26a/zhang26a.pdf},
  url       = {https://proceedings.mlr.press/v315/zhang26a.html},
  abstract  = {Cardiovascular diseases (CVDs) remain the foremost cause of mortality worldwide, a burden worsened by a severe deficit of healthcare workers. Artificial intelligence (AI) agents have shown potential to alleviate this gap through automated detection and proactive screening, yet their clinical application remains limited by: (1) rigid sequential workflows, whereas clinical care often requires adaptive reasoning that selects specific tests and, based on their results, guides personalised next steps; (2) reliance solely on intrinsic model capabilities to perform role assignment without domain-specific tool support; (3) general and static knowledge bases without continuous learning capability; and (4) fixed unimodal or bimodal inputs and lack of on-demand visual outputs when clinicians require visual clarification. In response, a multimodal framework, CardAIc-Agents, is proposed to augment models with external tools and adaptively support diverse cardiac tasks. First, a CardiacRAG agent generates task-aware plans from updatable cardiac knowledge, while the Chief agent integrates tools to autonomously execute these plans and deliver decisions. Second, to enable adaptive and case-specific customization, a stepwise update strategy is developed to dynamically refine plans based on preceding execution results once the task is assessed as complex. Third, a multidisciplinary discussion team is proposed, which is automatically invoked to interpret challenging cases, thereby supporting further adaptation. In addition, visual review panels are provided to assist validation when clinicians raise concerns. Experiments across three datasets demonstrated the effectiveness of CardAIc-Agents compared to mainstream Vision–Language Models (VLMs) and state-of-the-art agentic systems.}
}
Endnote
%0 Conference Paper
%T CardAIc-Agents: A Multimodal Framework with Hierarchical Adaptation for Cardiac Care Support
%A Yuting Zhang
%A Karina V. Bunting
%A Asgher Champsi
%A Xiaoxia Wang
%A Wenqi Lu
%A Alexander Thorley
%A Sandeep S Hothi
%A Zhaowen Qiu
%A Baturalp Buyukates
%A Dipak Kotecha
%A Jinming Duan
%B Proceedings of The 9th International Conference on Medical Imaging with Deep Learning
%C Proceedings of Machine Learning Research
%D 2026
%E Yuankai Huo
%E Mingchen Gao
%E Chang-Fu Kuo
%E Yueming Jin
%E Ruining Deng
%F pmlr-v315-zhang26a
%I PMLR
%P 2143--2170
%U https://proceedings.mlr.press/v315/zhang26a.html
%V 315
%X Cardiovascular diseases (CVDs) remain the foremost cause of mortality worldwide, a burden worsened by a severe deficit of healthcare workers. Artificial intelligence (AI) agents have shown potential to alleviate this gap through automated detection and proactive screening, yet their clinical application remains limited by: (1) rigid sequential workflows, whereas clinical care often requires adaptive reasoning that selects specific tests and, based on their results, guides personalised next steps; (2) reliance solely on intrinsic model capabilities to perform role assignment without domain-specific tool support; (3) general and static knowledge bases without continuous learning capability; and (4) fixed unimodal or bimodal inputs and lack of on-demand visual outputs when clinicians require visual clarification. In response, a multimodal framework, CardAIc-Agents, is proposed to augment models with external tools and adaptively support diverse cardiac tasks. First, a CardiacRAG agent generates task-aware plans from updatable cardiac knowledge, while the Chief agent integrates tools to autonomously execute these plans and deliver decisions. Second, to enable adaptive and case-specific customization, a stepwise update strategy is developed to dynamically refine plans based on preceding execution results once the task is assessed as complex. Third, a multidisciplinary discussion team is proposed, which is automatically invoked to interpret challenging cases, thereby supporting further adaptation. In addition, visual review panels are provided to assist validation when clinicians raise concerns. Experiments across three datasets demonstrated the effectiveness of CardAIc-Agents compared to mainstream Vision–Language Models (VLMs) and state-of-the-art agentic systems.
APA
Zhang, Y., Bunting, K.V., Champsi, A., Wang, X., Lu, W., Thorley, A., Hothi, S.S., Qiu, Z., Buyukates, B., Kotecha, D. & Duan, J. (2026). CardAIc-Agents: A Multimodal Framework with Hierarchical Adaptation for Cardiac Care Support. Proceedings of The 9th International Conference on Medical Imaging with Deep Learning, in Proceedings of Machine Learning Research 315:2143-2170. Available from https://proceedings.mlr.press/v315/zhang26a.html.