Optimizing Insulin Dosing for Type 1 Diabetes with Thyroid Dysfunction Using Q-Learning: A Personalized Approach to Chronic Disease Management

Jamell Dacon, Chukwulenyeudo Uwaeme, Chukwuemeka Obasi, Oluwasegun Soji-John, Oluwatobi Olajide, Marissa Savage, Iyinoluwa Ayodele, Oluwajomiloju King, Chelsea Minard, Michael Mosuro, Obaloluwa Wojuade, Nicholaus Somerville, Mikayla Brown, Abdulai Thomas Hallowell, Nyah Nunnally
Proceedings of The Second AAAI Bridge Program on AI for Medicine and Healthcare, PMLR 317:85-93, 2026.

Abstract

Thyroid dysfunction frequently coexists with Type 1 Diabetes (T1D), creating complex clinical challenges due to the critical interplay between thyroid hormone fluctuations and insulin sensitivity. Existing insulin dosing protocols typically do not account for these dynamic comorbid interactions, often leading to suboptimal glycemic control and increased adverse event risk. To address this gap and prioritize the clinical interpretability necessary for adoption, we propose a novel Reinforcement Learning (RL) framework based on tabular Q-learning that explicitly models discrete thyroid dysfunction severity within the patient state and incorporates the delayed pharmacodynamic effects of thyroid medications into a dual-objective reward function. This deliberate design enables personalized, transparent insulin dosing policies that optimize both glycemic control and thyroid hormone stabilization. We evaluate our approach on the real-world T1DGranada cohort comprising adults with T1D and hypothyroidism. Our comorbidity-aware, interpretable model achieves a 15% improvement in Time-in-Range (TIR) and a 42% reduction in hypoglycemic events compared to standard clinical baselines, while also significantly enhancing thyroid hormone stabilization rates. Offline evaluation techniques including importance sampling and Fitted Q-Evaluation (FQE) validate the robustness and reliability of the learned policies. Furthermore, expert endocrinologist blind review confirms high clinical alignment with 83% agreement. These results underscore the importance of explicitly modeling multimorbidity and delayed treatment effects in interpretable RL frameworks to advance personalized chronic disease management and facilitate clinical trust and integration.

Cite this Paper


BibTeX
@InProceedings{pmlr-v317-dacon26a,
  title = {Optimizing Insulin Dosing for Type 1 Diabetes with Thyroid Dysfunction Using Q-Learning: A Personalized Approach to Chronic Disease Management},
  author = {Dacon, Jamell and Uwaeme, Chukwulenyeudo and Obasi, Chukwuemeka and Soji-John, Oluwasegun and Olajide, Oluwatobi and Savage, Marissa and Ayodele, Iyinoluwa and King, Oluwajomiloju and Minard, Chelsea and Mosuro, Michael and Wojuade, Obaloluwa and Somerville, Nicholaus and Brown, Mikayla and Hallowell, Abdulai Thomas and Nunnally, Nyah},
  booktitle = {Proceedings of The Second AAAI Bridge Program on AI for Medicine and Healthcare},
  pages = {85--93},
  year = {2026},
  editor = {Wu, Junde and Pan, Jiazhen and Zhu, Jiayuan and Luo, Luyang and Li, Yitong and Xu, Min and Jin, Yueming and Rueckert, Daniel},
  volume = {317},
  series = {Proceedings of Machine Learning Research},
  month = {20--21 Jan},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v317/main/assets/dacon26a/dacon26a.pdf},
  url = {https://proceedings.mlr.press/v317/dacon26a.html},
  abstract = {Thyroid dysfunction frequently coexists with Type 1 Diabetes (T1D), creating complex clinical challenges due to the critical interplay between thyroid hormone fluctuations and insulin sensitivity. Existing insulin dosing protocols typically do not account for these dynamic comorbid interactions, often leading to suboptimal glycemic control and increased adverse event risk. To address this gap and prioritize the clinical interpretability necessary for adoption, we propose a novel Reinforcement Learning (RL) framework based on tabular Q-learning that explicitly models discrete thyroid dysfunction severity within the patient state and incorporates the delayed pharmacodynamic effects of thyroid medications into a dual-objective reward function. This deliberate design enables personalized, transparent insulin dosing policies that optimize both glycemic control and thyroid hormone stabilization. We evaluate our approach on the real-world T1DGranada cohort comprising adults with T1D and hypothyroidism. Our comorbidity-aware, interpretable model achieves a 15\% improvement in Time-in-Range (TIR) and a 42\% reduction in hypoglycemic events compared to standard clinical baselines, while also significantly enhancing thyroid hormone stabilization rates. Offline evaluation techniques including importance sampling and Fitted Q-Evaluation (FQE) validate the robustness and reliability of the learned policies. Furthermore, expert endocrinologist blind review confirms high clinical alignment with 83\% agreement. These results underscore the importance of explicitly modeling multimorbidity and delayed treatment effects in interpretable RL frameworks to advance personalized chronic disease management and facilitate clinical trust and integration.}
}
Endnote
%0 Conference Paper
%T Optimizing Insulin Dosing for Type 1 Diabetes with Thyroid Dysfunction Using Q-Learning: A Personalized Approach to Chronic Disease Management
%A Jamell Dacon
%A Chukwulenyeudo Uwaeme
%A Chukwuemeka Obasi
%A Oluwasegun Soji-John
%A Oluwatobi Olajide
%A Marissa Savage
%A Iyinoluwa Ayodele
%A Oluwajomiloju King
%A Chelsea Minard
%A Michael Mosuro
%A Obaloluwa Wojuade
%A Nicholaus Somerville
%A Mikayla Brown
%A Abdulai Thomas Hallowell
%A Nyah Nunnally
%B Proceedings of The Second AAAI Bridge Program on AI for Medicine and Healthcare
%C Proceedings of Machine Learning Research
%D 2026
%E Junde Wu
%E Jiazhen Pan
%E Jiayuan Zhu
%E Luyang Luo
%E Yitong Li
%E Min Xu
%E Yueming Jin
%E Daniel Rueckert
%F pmlr-v317-dacon26a
%I PMLR
%P 85--93
%U https://proceedings.mlr.press/v317/dacon26a.html
%V 317
%X Thyroid dysfunction frequently coexists with Type 1 Diabetes (T1D), creating complex clinical challenges due to the critical interplay between thyroid hormone fluctuations and insulin sensitivity. Existing insulin dosing protocols typically do not account for these dynamic comorbid interactions, often leading to suboptimal glycemic control and increased adverse event risk. To address this gap and prioritize the clinical interpretability necessary for adoption, we propose a novel Reinforcement Learning (RL) framework based on tabular Q-learning that explicitly models discrete thyroid dysfunction severity within the patient state and incorporates the delayed pharmacodynamic effects of thyroid medications into a dual-objective reward function. This deliberate design enables personalized, transparent insulin dosing policies that optimize both glycemic control and thyroid hormone stabilization. We evaluate our approach on the real-world T1DGranada cohort comprising adults with T1D and hypothyroidism. Our comorbidity-aware, interpretable model achieves a 15% improvement in Time-in-Range (TIR) and a 42% reduction in hypoglycemic events compared to standard clinical baselines, while also significantly enhancing thyroid hormone stabilization rates. Offline evaluation techniques including importance sampling and Fitted Q-Evaluation (FQE) validate the robustness and reliability of the learned policies. Furthermore, expert endocrinologist blind review confirms high clinical alignment with 83% agreement. These results underscore the importance of explicitly modeling multimorbidity and delayed treatment effects in interpretable RL frameworks to advance personalized chronic disease management and facilitate clinical trust and integration.
APA
Dacon, J., Uwaeme, C., Obasi, C., Soji-John, O., Olajide, O., Savage, M., Ayodele, I., King, O., Minard, C., Mosuro, M., Wojuade, O., Somerville, N., Brown, M., Hallowell, A. T., & Nunnally, N. (2026). Optimizing Insulin Dosing for Type 1 Diabetes with Thyroid Dysfunction Using Q-Learning: A Personalized Approach to Chronic Disease Management. Proceedings of The Second AAAI Bridge Program on AI for Medicine and Healthcare, in Proceedings of Machine Learning Research 317:85-93. Available from https://proceedings.mlr.press/v317/dacon26a.html.
