A Meta-Reinforcement Learning Algorithm for Causal Discovery
Proceedings of the Second Conference on Causal Learning and Reasoning, PMLR 213:602-619, 2023.
Uncovering the underlying causal structure of a phenomenon, domain or environment is of great scientific interest, not least because of the inferences that can be derived from such structures. Unfortunately though, given an environment, identifying its causal structure poses significant challenges. Amongst those are the need for costly interventions and the size of the space of possible structures that has to be searched. In this work, we propose a meta-reinforcement learning setup that addresses these challenges by learning a causal discovery algorithm, called Meta-Causal Discovery, or MCD. We model this algorithm as a policy that is trained on a set of environments with known causal structures to perform budgeted interventions. Simultaneously, the policy learns to maintain an estimate of the environment’s causal structure. The learned policy can then be used as a causal discovery algorithm to estimate the structure of environments in a matter of milliseconds. At test time, our algorithm performs well even in environments that induce previously unseen causal structures. We empirically show that MCD estimates good graphs compared to SOTA approaches on toy environments and thus constitutes a proof-of-concept of learning causal discovery algorithms.