Causal Inference by Identification of Vector Autoregressive Processes with Hidden Components


Philipp Geiger, Kun Zhang, Bernhard Schoelkopf, Mingming Gong, Dominik Janzing ;
Proceedings of the 32nd International Conference on Machine Learning, PMLR 37:1917-1925, 2015.


A widely applied approach to causal inference from a time series X, often referred to as “(linear) Granger causal analysis”, is to simply regress present on past and interpret the regression matrix \hatB causally. However, if there is an unmeasured time series Z that influences X, then this approach can lead to wrong causal conclusions, i.e., distinct from those one would draw if one had additional information such as Z. In this paper we take a different approach: We assume that X together with some hidden Z forms a first order vector autoregressive (VAR) process with transition matrix A, and argue why it is more valid to interpret A causally instead of \hatB. Then we examine under which conditions the most important parts of A are identifiable or almost identifiable from only X. Essentially, sufficient conditions are (1) non-Gaussian, independent noise or (2) no influence from X to Z. We present two estimation algorithms that are tailored towards conditions (1) and (2), respectively, and evaluate them on synthetic and real-world data. We discuss how to check the model using X.

Related Material