Decision-Focused Model-based Reinforcement Learning for Reward Transfer
Proceedings of the 9th Machine Learning for Healthcare Conference, PMLR 252, 2024.
Abstract
Model-based reinforcement learning (MBRL) provides a way to learn a transition model of the environment, which can then be used to plan personalized policies for different patient cohorts and to understand the dynamics involved in the decision-making process. However, standard MBRL algorithms are either sensitive to changes in the reward function or achieve suboptimal performance on the task when the transition model is restricted. Motivated by the need to use simple and interpretable models in critical domains such as healthcare, we propose a novel robust decision-focused (RDF) algorithm that learns a transition model that achieves high returns while being robust to changes in the reward function. We demonstrate that our RDF algorithm can be used with several model classes and planning algorithms. We also provide theoretical and empirical evidence, on a variety of simulators and real patient data, that RDF can learn simple yet effective models that can be used to plan personalized policies.
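The reward-transfer setting described in the abstract can be illustrated with a minimal sketch. This is not the RDF algorithm; it only shows a single fixed transition model being reused to plan under two different reward functions with plain value iteration. The three-state MDP, the `plan` helper, and all numbers are hypothetical and chosen purely for illustration.

```python
# Hypothetical 3-state, 2-action MDP; every number here is made up.
# P[s][a] lists next-state probabilities: this plays the role of the
# (already learned) transition model that is shared across rewards.
P = [
    [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],  # state 0: stay, or move to 1
    [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]],  # state 1: back to 0, or move to 2
    [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]],  # state 2: back to 1, or stay
]


def plan(P, reward, gamma=0.9, iters=200):
    """Value iteration; returns the greedy action for each state.

    `reward[s2]` is the reward received on arriving in state s2.
    """
    n = len(P)
    V = [0.0] * n

    def q(s, a):
        # Expected one-step reward plus discounted value of the next state.
        return sum(p * (reward[s2] + gamma * V[s2])
                   for s2, p in enumerate(P[s][a]))

    for _ in range(iters):
        V = [max(q(s, a) for a in range(len(P[s]))) for s in range(n)]
    return [max(range(len(P[s])), key=lambda a: q(s, a)) for s in range(n)]


# Same transition model, two reward functions (e.g. two cohorts' objectives).
policy_a = plan(P, reward=[0.0, 0.0, 1.0])  # reward for reaching state 2
policy_b = plan(P, reward=[1.0, 0.0, 0.0])  # reward for reaching state 0
```

In this toy example the model `P` is exact, so planning is optimal for both rewards. The abstract's concern is the harder case where the model class is restricted and a model fitted for one reward may plan poorly under another.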