Context-Adaptive Reinforcement Learning using Unsupervised Learning of Context Variables
NeurIPS 2020 Workshop on Pre-registration in Machine Learning, PMLR 148:236-254, 2021.
In Reinforcement Learning (RL), changes in the context often cause a distributional change in the observations of the environment, requiring the agent to adapt. For example, when a new user interacts with a system, the system has to adapt to the needs of that user, which may differ based on characteristics that are often not observable. In this Contextual Reinforcement Learning (CRL) setting, the agent must not only recognise and adapt to a context, but also remember previous ones. However, in CRL the context is often unknown, so a supervised approach to learning to predict the context is not feasible. In this paper, we introduce the Context-Adaptive Reinforcement Learning Agent (CARLA), which learns context variables in an unsupervised manner and adapts its policy to the current context. We provide a hypothesis, based on the generative process, that explains how the context variable relates to the states and observations of an environment. Further, we propose an experimental protocol to test and validate our hypothesis, and we compare the performance of the proposed approach with other methods in a CRL environment. Finally, we provide empirical results in support of our hypothesis, demonstrating the effectiveness of CARLA in tackling CRL.
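To make the CRL setting above concrete, the following is a minimal, hypothetical sketch (not the paper's implementation): a hidden context variable shifts the observation distribution, and an agent infers the context in an unsupervised way (a simple 1-D k-means clustering of past observations) before choosing a context-dependent action. All names and the environment itself are illustrative assumptions.

```python
import numpy as np

# Illustrative sketch of Contextual RL with a hidden context variable.
# The context c in {0, 1} is never shown to the agent; it only shifts the
# observation distribution and determines which action is rewarded.
rng = np.random.default_rng(0)

def env_step(context, action):
    """Environment step: context shifts observations and picks the good action."""
    obs = rng.normal(loc=4.0 * context, scale=0.5)  # context -> observation shift
    reward = 1.0 if action == context else 0.0      # optimal action depends on context
    return obs, reward

def infer_context(obs_history, obs, iters=10):
    """Unsupervised context inference: 1-D k-means (k=2) over observations."""
    data = np.array(obs_history)
    centers = np.array([data.min(), data.max()], dtype=float)
    for _ in range(iters):
        assign = np.abs(data[:, None] - centers[None, :]).argmin(axis=1)
        for k in range(2):
            if np.any(assign == k):
                centers[k] = data[assign == k].mean()
    return int(np.abs(obs - centers).argmin())  # cluster index of current obs

# Contexts switch across episodes; the agent acts on the inferred context.
obs_history, total_reward, steps = [], 0.0, 0
for episode in range(20):
    context = episode % 2                  # hidden from the agent
    obs, _ = env_step(context, 0)          # observe before acting
    obs_history.append(obs)
    c_hat = infer_context(obs_history, obs)
    _, reward = env_step(context, c_hat)   # context-adapted action
    total_reward += reward
    steps += 1

print(total_reward / steps)  # average reward; near 1.0 once clusters separate
```

Because the two contexts produce well-separated observations here, the clustering recovers the context after a few episodes and the agent's per-context action choice earns near-maximal reward; in the paper's setting the context must instead be recovered from the full generative process relating contexts, states, and observations.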