General Identifiability with Arbitrary Surrogate Experiments

Sanghack Lee, Juan D. Correa, Elias Bareinboim
Proceedings of The 35th Uncertainty in Artificial Intelligence Conference, PMLR 115:389-398, 2020.

Abstract

We study the problem of causal identification from an arbitrary collection of observational and experimental distributions, and substantive knowledge about the phenomenon under investigation, which usually comes in the form of a causal graph. We call this problem \textit{g-identifiability}, or gID for short. The gID setting encompasses two well-known problems in causal inference, namely, identifiability [Pearl, 1995] and z-identifiability [Bareinboim and Pearl, 2012] – the former assumes that an observational distribution is necessarily available, and no experiments can be performed, conditions that are both relaxed in the gID setting; the latter assumes that \textit{all} combinations of experiments are available, i.e., the power set of the experimental set $\mathbf{Z}$, which gID does not require a priori. In this paper, we introduce a general strategy to prove non-gID based on \textit{hedgelets} and \textit{thickets}, which leads to a necessary and sufficient graphical condition for the corresponding decision problem. We further develop a procedure for systematically computing the target effect, and prove that it is sound and complete for gID instances. In other words, failure of the algorithm in returning an expression implies that the target effect is not computable from the available distributions. Finally, as a corollary of these results, we show that do-calculus is complete for the task of g-identifiability.

Cite this Paper


BibTeX
@InProceedings{pmlr-v115-lee20b,
  title = {General Identifiability with Arbitrary Surrogate Experiments},
  author = {Lee, Sanghack and Correa, Juan D. and Bareinboim, Elias},
  booktitle = {Proceedings of The 35th Uncertainty in Artificial Intelligence Conference},
  pages = {389--398},
  year = {2020},
  editor = {Adams, Ryan P. and Gogate, Vibhav},
  volume = {115},
  series = {Proceedings of Machine Learning Research},
  month = {22--25 Jul},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v115/lee20b/lee20b.pdf},
  url = {https://proceedings.mlr.press/v115/lee20b.html},
  abstract = {We study the problem of causal identification from an arbitrary collection of observational and experimental distributions, and substantive knowledge about the phenomenon under investigation, which usually comes in the form of a causal graph. We call this problem \textit{g-identifiability}, or gID for short. The gID setting encompasses two well-known problems in causal inference, namely, identifiability [Pearl, 1995] and z-identifiability [Bareinboim and Pearl, 2012] – the former assumes that an observational distribution is necessarily available, and no experiments can be performed, conditions that are both relaxed in the gID setting; the latter assumes that \textit{all} combinations of experiments are available, i.e., the power set of the experimental set $\mathbf{Z}$, which gID does not require a priori. In this paper, we introduce a general strategy to prove non-gID based on \textit{hedgelets} and \textit{thickets}, which leads to a necessary and sufficient graphical condition for the corresponding decision problem. We further develop a procedure for systematically computing the target effect, and prove that it is sound and complete for gID instances. In other words, failure of the algorithm in returning an expression implies that the target effect is not computable from the available distributions. Finally, as a corollary of these results, we show that do-calculus is complete for the task of g-identifiability.}
}
Endnote
%0 Conference Paper
%T General Identifiability with Arbitrary Surrogate Experiments
%A Sanghack Lee
%A Juan D. Correa
%A Elias Bareinboim
%B Proceedings of The 35th Uncertainty in Artificial Intelligence Conference
%C Proceedings of Machine Learning Research
%D 2020
%E Ryan P. Adams
%E Vibhav Gogate
%F pmlr-v115-lee20b
%I PMLR
%P 389--398
%U https://proceedings.mlr.press/v115/lee20b.html
%V 115
%X We study the problem of causal identification from an arbitrary collection of observational and experimental distributions, and substantive knowledge about the phenomenon under investigation, which usually comes in the form of a causal graph. We call this problem \textit{g-identifiability}, or gID for short. The gID setting encompasses two well-known problems in causal inference, namely, identifiability [Pearl, 1995] and z-identifiability [Bareinboim and Pearl, 2012] – the former assumes that an observational distribution is necessarily available, and no experiments can be performed, conditions that are both relaxed in the gID setting; the latter assumes that \textit{all} combinations of experiments are available, i.e., the power set of the experimental set $\mathbf{Z}$, which gID does not require a priori. In this paper, we introduce a general strategy to prove non-gID based on \textit{hedgelets} and \textit{thickets}, which leads to a necessary and sufficient graphical condition for the corresponding decision problem. We further develop a procedure for systematically computing the target effect, and prove that it is sound and complete for gID instances. In other words, failure of the algorithm in returning an expression implies that the target effect is not computable from the available distributions. Finally, as a corollary of these results, we show that do-calculus is complete for the task of g-identifiability.
APA
Lee, S., Correa, J.D. & Bareinboim, E. (2020). General Identifiability with Arbitrary Surrogate Experiments. Proceedings of The 35th Uncertainty in Artificial Intelligence Conference, in Proceedings of Machine Learning Research 115:389-398. Available from https://proceedings.mlr.press/v115/lee20b.html.
