Causal Inference by Identification of Vector Autoregressive Processes with Hidden Components

Philipp Geiger, Kun Zhang, Bernhard Schoelkopf, Mingming Gong, Dominik Janzing
Proceedings of the 32nd International Conference on Machine Learning, PMLR 37:1917-1925, 2015.

Abstract

A widely applied approach to causal inference from a time series X, often referred to as “(linear) Granger causal analysis”, is to simply regress present on past and interpret the regression matrix B̂ causally. However, if there is an unmeasured time series Z that influences X, then this approach can lead to wrong causal conclusions, i.e., distinct from those one would draw if one had additional information such as Z. In this paper we take a different approach: We assume that X together with some hidden Z forms a first order vector autoregressive (VAR) process with transition matrix A, and argue why it is more valid to interpret A causally instead of B̂. Then we examine under which conditions the most important parts of A are identifiable or almost identifiable from only X. Essentially, sufficient conditions are (1) non-Gaussian, independent noise or (2) no influence from X to Z. We present two estimation algorithms that are tailored towards conditions (1) and (2), respectively, and evaluate them on synthetic and real-world data. We discuss how to check the model using X.
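The confounding effect described above can be illustrated with a small simulation. The sketch below is not the paper's estimation algorithm; it is a hypothetical toy model (matrix values and dimensions chosen for illustration) in which a hidden component Z drives two observed components X1 and X2, and a naive Granger-style regression of present on past recovers spurious cross-links that the true transition matrix A does not contain:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy transition matrix A for the full process (X1, X2, Z).
# Z influences both observed components but is never measured;
# X does not influence Z (condition (2) in the abstract).
A = np.array([
    [0.5, 0.0, 0.8],   # X1 <- X1, Z
    [0.0, 0.5, 0.8],   # X2 <- X2, Z
    [0.0, 0.0, 0.9],   # Z  <- Z only
])

T = 200_000
W = np.zeros((T, 3))
for t in range(1, T):
    # Laplace noise: non-Gaussian, independent (condition (1)).
    W[t] = A @ W[t - 1] + rng.laplace(scale=1.0, size=3)

X = W[:, :2]  # the observed time series only

# Naive Granger-style step: regress X_t on X_{t-1}.
B_hat, *_ = np.linalg.lstsq(X[:-1], X[1:], rcond=None)
B_hat = B_hat.T  # rows: current components, columns: lagged components

print("X-block of the true A:\n", A[:2, :2])
print("Naive regression matrix B_hat:\n", B_hat)
```

The true X-block of A has zero off-diagonal entries, yet B̂ picks up clearly nonzero cross-coefficients, because the lagged value of each observed component carries information about the hidden Z. Interpreting B̂ causally would therefore suggest influence between X1 and X2 that does not exist.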

Cite this Paper


BibTeX
@InProceedings{pmlr-v37-geiger15,
  title     = {Causal Inference by Identification of Vector Autoregressive Processes with Hidden Components},
  author    = {Geiger, Philipp and Zhang, Kun and Schoelkopf, Bernhard and Gong, Mingming and Janzing, Dominik},
  booktitle = {Proceedings of the 32nd International Conference on Machine Learning},
  pages     = {1917--1925},
  year      = {2015},
  editor    = {Bach, Francis and Blei, David},
  volume    = {37},
  series    = {Proceedings of Machine Learning Research},
  address   = {Lille, France},
  month     = {07--09 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v37/geiger15.pdf},
  url       = {https://proceedings.mlr.press/v37/geiger15.html},
  abstract  = {A widely applied approach to causal inference from a time series X, often referred to as “(linear) Granger causal analysis”, is to simply regress present on past and interpret the regression matrix $\hat{B}$ causally. However, if there is an unmeasured time series Z that influences X, then this approach can lead to wrong causal conclusions, i.e., distinct from those one would draw if one had additional information such as Z. In this paper we take a different approach: We assume that X together with some hidden Z forms a first order vector autoregressive (VAR) process with transition matrix A, and argue why it is more valid to interpret A causally instead of $\hat{B}$. Then we examine under which conditions the most important parts of A are identifiable or almost identifiable from only X. Essentially, sufficient conditions are (1) non-Gaussian, independent noise or (2) no influence from X to Z. We present two estimation algorithms that are tailored towards conditions (1) and (2), respectively, and evaluate them on synthetic and real-world data. We discuss how to check the model using X.}
}
Endnote
%0 Conference Paper
%T Causal Inference by Identification of Vector Autoregressive Processes with Hidden Components
%A Philipp Geiger
%A Kun Zhang
%A Bernhard Schoelkopf
%A Mingming Gong
%A Dominik Janzing
%B Proceedings of the 32nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2015
%E Francis Bach
%E David Blei
%F pmlr-v37-geiger15
%I PMLR
%P 1917--1925
%U https://proceedings.mlr.press/v37/geiger15.html
%V 37
%X A widely applied approach to causal inference from a time series X, often referred to as “(linear) Granger causal analysis”, is to simply regress present on past and interpret the regression matrix B̂ causally. However, if there is an unmeasured time series Z that influences X, then this approach can lead to wrong causal conclusions, i.e., distinct from those one would draw if one had additional information such as Z. In this paper we take a different approach: We assume that X together with some hidden Z forms a first order vector autoregressive (VAR) process with transition matrix A, and argue why it is more valid to interpret A causally instead of B̂. Then we examine under which conditions the most important parts of A are identifiable or almost identifiable from only X. Essentially, sufficient conditions are (1) non-Gaussian, independent noise or (2) no influence from X to Z. We present two estimation algorithms that are tailored towards conditions (1) and (2), respectively, and evaluate them on synthetic and real-world data. We discuss how to check the model using X.
RIS
TY  - CPAPER
TI  - Causal Inference by Identification of Vector Autoregressive Processes with Hidden Components
AU  - Philipp Geiger
AU  - Kun Zhang
AU  - Bernhard Schoelkopf
AU  - Mingming Gong
AU  - Dominik Janzing
BT  - Proceedings of the 32nd International Conference on Machine Learning
DA  - 2015/06/01
ED  - Francis Bach
ED  - David Blei
ID  - pmlr-v37-geiger15
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 37
SP  - 1917
EP  - 1925
L1  - http://proceedings.mlr.press/v37/geiger15.pdf
UR  - https://proceedings.mlr.press/v37/geiger15.html
AB  - A widely applied approach to causal inference from a time series X, often referred to as “(linear) Granger causal analysis”, is to simply regress present on past and interpret the regression matrix B̂ causally. However, if there is an unmeasured time series Z that influences X, then this approach can lead to wrong causal conclusions, i.e., distinct from those one would draw if one had additional information such as Z. In this paper we take a different approach: We assume that X together with some hidden Z forms a first order vector autoregressive (VAR) process with transition matrix A, and argue why it is more valid to interpret A causally instead of B̂. Then we examine under which conditions the most important parts of A are identifiable or almost identifiable from only X. Essentially, sufficient conditions are (1) non-Gaussian, independent noise or (2) no influence from X to Z. We present two estimation algorithms that are tailored towards conditions (1) and (2), respectively, and evaluate them on synthetic and real-world data. We discuss how to check the model using X.
ER  -
APA
Geiger, P., Zhang, K., Schoelkopf, B., Gong, M. & Janzing, D. (2015). Causal Inference by Identification of Vector Autoregressive Processes with Hidden Components. Proceedings of the 32nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 37:1917-1925. Available from https://proceedings.mlr.press/v37/geiger15.html.