Reinforcement Learning For Sepsis Treatment: A Continuous Action Space Solution
Proceedings of the 7th Machine Learning for Healthcare Conference, PMLR 182:631-647, 2022.
Sepsis is the leading cause of death in intensive care units. It is difficult to treat because the optimal treatment remains unclear and individual patients respond differently to the same treatment. Recent attempts to use reinforcement learning to provide real-time, personalized treatment recommendations have shown promising results. However, the discrete action design (i.e., discretizing the continuous action space into coarse-grained decisions) poses problems for policy learning and evaluation and limits the effectiveness of the treatment recommendations. In this work, we propose a continuous state and action space solution inspired by the Deep Deterministic Policy Gradient (DDPG) algorithm. We perform qualitative evaluations and apply the direct method for off-policy evaluation. The learned policy performs on par with clinicians, and its recommendations are more clinically reasonable and explainable than those of the state of the art.
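To make the concern about the discrete action design concrete, the following is a minimal sketch of how binning a continuous dose loses information. The bin edges, the dose values, and the function name `discretize` are hypothetical and chosen only for illustration; they are not taken from the paper or from any clinical dataset.

```python
from bisect import bisect_right

# Hypothetical bin edges for a single treatment dose (arbitrary units).
# Bin 0 is reserved for "no dose"; bins 1..4 cover the positive range.
EDGES = [0.1, 0.5, 2.0]


def discretize(dose, edges=EDGES):
    """Map a continuous dose to a coarse action index.

    Returns 0 for a zero (or negative) dose, otherwise 1 plus the number
    of edges that are less than or equal to the dose.
    """
    if dose <= 0:
        return 0
    return 1 + bisect_right(edges, dose)


# Two clearly different doses collapse into the same discrete action,
# so a discrete policy cannot distinguish between them.
low, high = 0.6, 1.9
print(discretize(low), discretize(high))  # prints "3 3"
```

Under these assumed edges, doses of 0.6 and 1.9 differ by roughly a factor of three yet share one action index. A continuous-action policy outputs the dose itself and therefore avoids this collapse.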