Causal Discovery and Reinforcement Learning: A Synergistic Integration
Proceedings of The 11th International Conference on Probabilistic Graphical Models, PMLR 186:421-432, 2022.
Both Reinforcement Learning (RL) and Causal Modeling (CM) are indispensable components on the road to general artificial intelligence; however, they are usually treated separately, even though the two areas can effectively complement each other in problem solving. On the one hand, the interventional nature of the data-generating process in RL favors the discovery of the underlying causal structure. On the other hand, if an agent knows the possible consequences of its actions, as given by causal models, it can select actions more effectively, reducing exploration and thereby accelerating learning. Moreover, an agent that maintains a causal model of the world it operates in gains improved interpretability and transfer learning, among other benefits. In this article, we propose a combination strategy that provides an intelligent agent with the ability to simultaneously learn and use causal models in the context of reinforcement learning. The proposed method learns a Causal Dynamic Bayesian Network for each of the agent's actions and uses those models to improve the action selection process. To test our algorithm, we performed experiments on a simple synthetic scenario called the “coffee task”. Our method achieves better results in policy learning than a traditional model-free algorithm (Q-Learning), and it also learns the underlying causal models. We believe that these results reveal several interesting and challenging directions for future work.
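To make the general idea concrete, the following is a minimal sketch of how per-action models of an action's consequences could guide action selection inside Q-Learning. It is a hypothetical illustration, not the algorithm of the paper: the `CausalQAgent` class is invented, and the learned Causal Dynamic Bayesian Networks are replaced here by simple effect counters that record whether an action has ever changed the state.

```python
import random
from collections import defaultdict


class CausalQAgent:
    """Q-Learning whose exploration is pruned by per-action effect models.

    Each action's "causal model" here is just a counter of whether the
    action has ever changed the state -- a crude stand-in for a learned
    Causal Dynamic Bayesian Network. Actions observed to be ineffective
    in a state are skipped, reducing exploration.
    """

    def __init__(self, actions, alpha=0.5, gamma=0.9, epsilon=0.2):
        self.actions = actions
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.q = defaultdict(float)                   # (state, action) -> value
        self.effects = defaultdict(lambda: [0, 0])    # (state, action) -> [no_change, change]

    def plausible_actions(self, state):
        """Keep actions never observed to be ineffective in this state."""
        keep = [a for a in self.actions
                if self.effects[(state, a)][1] > 0
                or self.effects[(state, a)][0] == 0]
        return keep or self.actions

    def select(self, state):
        """Epsilon-greedy selection restricted to plausible actions."""
        candidates = self.plausible_actions(state)
        if random.random() < self.epsilon:
            return random.choice(candidates)
        return max(candidates, key=lambda a: self.q[(state, a)])

    def update(self, s, a, r, s2):
        """Record the action's observed effect, then do a standard Q update."""
        self.effects[(s, a)][1 if s2 != s else 0] += 1
        best = max(self.q[(s2, b)] for b in self.actions)
        self.q[(s, a)] += self.alpha * (r + self.gamma * best - self.q[(s, a)])
```

In a toy chain environment where "noop" never changes the state, the agent quickly drops that action from its candidate set, illustrating how knowledge of an action's consequences can shrink the exploration space that plain Q-Learning would search exhaustively.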