Conditional independence testing based on a nearest-neighbor estimator of conditional mutual information
Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics, PMLR 84:938-947, 2018.
Conditional independence testing is a fundamental problem underlying causal discovery and a particularly challenging task in the presence of nonlinear dependencies. Here a fully non-parametric test for continuous data based on conditional mutual information combined with a local permutation scheme is presented. Numerical experiments covering sample sizes from $50$ to $2,000$ and dimensions up to $10$ demonstrate that the test reliably generates the null distribution. For smooth nonlinear dependencies, the test has higher power than kernel-based tests in lower dimensions and similar or slightly lower power in higher dimensions. For highly non-smooth densities the data-adaptive nearest neighbor approach is particularly well-suited while kernel methods yield much lower power. The experiments also show that kernel methods utilizing an analytical approximation of the null distribution are not well-calibrated for sample sizes below $1,000$. Combining the local permutation scheme with these kernel tests leads to better calibration but lower power. For smaller sample sizes and lower dimensions, the proposed test is faster than random fourier feature-based kernel tests if (embarrassingly) parallelized, but the runtime increases more sharply with sample size and dimensionality. Thus, more theoretical research to analytically approximate the null distribution and speed up the estimation is desirable. As illustrated on real data here, the test is ideally suited in combination with causal discovery algorithms.