Policy Learning for Optimal Individualized Dose Intervals
Proceedings of The 25th International Conference on Artificial Intelligence and Statistics, PMLR 151:1671-1693, 2022.
We study the problem of learning individualized dose intervals using observational data. Few previous works address policy learning with continuous treatments, and all of them focus on recommending a single optimal dose rather than an optimal dose interval. In this paper, we propose a new method to estimate such an optimal dose interval, named the probability dose interval (PDI). The potential outcomes for doses in the PDI are guaranteed to be better than a pre-specified threshold with a given probability (e.g., $50\%$). The associated nonconvex optimization problem can be solved efficiently by the Difference-of-Convex functions (DC) algorithm. We prove that our estimated policy is consistent, and that its risk converges to that of the best-in-class policy at a root-$n$ rate. Numerical simulations show the advantage of the proposed method over outcome-modeling-based benchmarks. We further demonstrate the performance of our method in determining individualized hemoglobin A1c (HbA1c) control intervals for elderly patients with diabetes.
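To illustrate the general idea behind the DC algorithm mentioned above, here is a minimal sketch on a toy one-dimensional problem. It is not the paper's objective or implementation: the function $f(x) = x^4 - 2x^2$, its split into convex parts $g(x) = x^4$ and $h(x) = 2x^2$, and the name `dca_toy` are all assumptions chosen only because the convex subproblem has a closed-form solution. Each iteration linearizes $h$ at the current iterate and minimizes the resulting convex surrogate.

```python
import math


def dca_toy(x0, n_iter=100, tol=1e-12):
    """DC algorithm on the toy objective f(x) = g(x) - h(x),
    with g(x) = x**4 and h(x) = 2*x**2 (both convex).

    Hypothetical illustration only; not the objective used in the paper.
    """
    x = x0
    for _ in range(n_iter):
        # Gradient of the convex part h at the current iterate.
        grad_h = 4.0 * x
        # Convex subproblem: minimize g(x) - grad_h * x = x**4 - grad_h * x.
        # Stationarity gives 4 * x**3 = grad_h, i.e. a real cube root.
        x_new = math.copysign(abs(grad_h / 4.0) ** (1.0 / 3.0), grad_h)
        if abs(x_new - x) < tol:
            x = x_new
            break
        x = x_new
    return x
```

Starting from a positive point the iterates approach the local minimizer at $x = 1$, and from a negative point they approach $x = -1$; starting exactly at the critical point $x = 0$ the iteration stays there, which reflects that the DC algorithm guarantees convergence to a critical point rather than a global minimizer.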