Efficient interventional distribution learning in the PAC framework
Proceedings of The 25th International Conference on Artificial Intelligence and Statistics, PMLR 151:7531-7549, 2022.
Abstract
We consider the problem of efficiently inferring interventional distributions in a causal Bayesian network from a finite number of observations. Let $P$ be a causal model on a set $V$ of observable variables over a given causal graph $G$. For sets $X, Y \subseteq V$ and an assignment $x$ to $X$, $P_x(Y)$ denotes the interventional distribution on $Y$ induced by the intervention that sets $X$ to $x$. Shpitser and Pearl (AAAI 2006), building on the work of Tian and Pearl (AAAI 2001), proved that the ID algorithm is sound and complete for recovering $P_x(Y)$ from observations. We give the first provably efficient version of the ID algorithm. In particular, under natural assumptions, we give a polynomial-time algorithm that, on input a causal graph $G$ on observable variables $V$ and a setting $x$ of a set $X \subseteq V$ of bounded size, outputs succinct descriptions of both an evaluator and a generator for a distribution $\hat{P}$ that is $\epsilon$-close (in total variation distance) to $P_x(Y)$, where $Y = V \setminus X$, whenever $P_x(Y)$ is identifiable. We also show that when $Y$ is an arbitrary subset of $V \setminus X$, there is no efficient algorithm that outputs an evaluator of a distribution that is $\epsilon$-close to $P_x(Y)$ unless all problems that have statistical zero-knowledge proofs, including the Graph Isomorphism problem, have efficient randomized algorithms.
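To make the objects in the abstract concrete, the following is a minimal, self-contained sketch (not the paper's ID algorithm) of an interventional distribution $P_x(Y)$ on a toy causal Bayesian network with binary variables $Z \to X$, $Z \to Y$, $X \to Y$; all variable names and probabilities are illustrative assumptions. The intervention do$(X=x)$ cuts the edge $Z \to X$, giving the adjustment formula $P_x(y) = \sum_z P(z)\,P(y \mid x, z)$, and total variation distance is included to match the paper's notion of $\epsilon$-closeness.

```python
# Toy causal Bayesian network over binary Z, X, Y with edges
# Z -> X, Z -> Y, X -> Y.  Probabilities are made up for illustration.
p_z = {0: 0.6, 1: 0.4}                               # P(Z)
p_x_given_z = {(0, 0): 0.7, (1, 0): 0.3,             # P(X=x | Z=z), keys (x, z)
               (0, 1): 0.2, (1, 1): 0.8}
p_y_given_xz = {(0, 0, 0): 0.9, (1, 0, 0): 0.1,      # P(Y=y | X=x, Z=z), keys (y, x, z)
                (0, 1, 0): 0.4, (1, 1, 0): 0.6,
                (0, 0, 1): 0.5, (1, 0, 1): 0.5,
                (0, 1, 1): 0.2, (1, 1, 1): 0.8}

def observational(y, x):
    """Observational P(Y=y, X=x) = sum_z P(z) P(x|z) P(y|x,z)."""
    return sum(p_z[z] * p_x_given_z[(x, z)] * p_y_given_xz[(y, x, z)]
               for z in (0, 1))

def interventional(y, x):
    """Interventional P_x(Y=y) via adjustment: do(X=x) removes the edge
    Z -> X, so P_x(y) = sum_z P(z) P(y | x, z)."""
    return sum(p_z[z] * p_y_given_xz[(y, x, z)] for z in (0, 1))

def total_variation(p, q, support):
    """TV distance between two distributions given as functions on a support."""
    return 0.5 * sum(abs(p(v) - q(v)) for v in support)

if __name__ == "__main__":
    for x in (0, 1):
        print(f"do(X={x}):",
              {y: round(interventional(y, x), 3) for y in (0, 1)})
```

In this sketch the dictionaries play the role of an exact model; the paper's setting instead estimates such quantities from finitely many samples and, when the query is identifiable, returns both an evaluator (a routine like `interventional` that assigns probabilities) and a generator (a routine that samples from $\hat{P}$).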