On the Unlikelihood of D-Separation

Itai Feigenbaum, Devansh Arpit, Shelby Heinecke, Juan Carlos Niebles, Weiran Yao, Huan Wang, Caiming Xiong, Silvio Savarese
Proceedings of The 12th International Conference on Probabilistic Graphical Models, PMLR 246:70-92, 2024.

Abstract

Causal discovery aims to recover a causal graph from data generated by it; constraint-based methods do so by searching for d-separating conditioning sets of nodes. In this paper, we provide analytic evidence that on large graphs, d-separation is a rare phenomenon, even when guaranteed to exist. Our analysis implies poor average-case performance of existing constraint-based methods, except on a vanishingly small class of extremely sparse graphs. We consider a set $V=\{v_1,\ldots,v_n\}$ of nodes, and generate a random DAG $G=(V,E)$ where $v_i \rightarrow v_j \in E$ with i.i.d probability $p_1$ if $i j$. For any d-separable pair of nodes $v_i$ and $v_j$, we provide upper bounds on the probability that a subset of $V\backslash\{v_i,v_j\}$ d-separates the pair, under different subset selection scenarios; our upper bounds decay exponentially fast to $0$ as $|V| \rightarrow \infty$ for any fixed expected density. We then analyze the average-case performance of constraint-based methods, including the PC Algorithm, a variant of the SGS Algorithm called UniformSGS, and also any constraint-based method limited to small conditioning sets (a limitation which holds in most of existing literature). We show that these algorithms usually suffer from low precision or exponential running time on all but extremely sparse graphs.

Cite this Paper


BibTeX
@InProceedings{pmlr-v246-feigenbaum24a, title = {On the Unlikelihood of D-Separation}, author = {Feigenbaum, Itai and Arpit, Devansh and Heinecke, Shelby and Niebles, Juan Carlos and Yao, Weiran and Wang, Huan and Xiong, Caiming and Savarese, Silvio}, booktitle = {Proceedings of The 12th International Conference on Probabilistic Graphical Models}, pages = {70--92}, year = {2024}, editor = {Kwisthout, Johan and Renooij, Silja}, volume = {246}, series = {Proceedings of Machine Learning Research}, month = {11--13 Sep}, publisher = {PMLR}, pdf = {https://raw.githubusercontent.com/mlresearch/v246/main/assets/feigenbaum24a/feigenbaum24a.pdf}, url = {https://proceedings.mlr.press/v246/feigenbaum24a.html}, abstract = {Causal discovery aims to recover a causal graph from data generated by it; constraint-based methods do so by searching for d-separating conditioning sets of nodes. In this paper, we provide analytic evidence that on large graphs, d-separation is a rare phenomenon, even when guaranteed to exist. Our analysis implies poor average-case performance of existing constraint-based methods, except on a vanishingly small class of extremely sparse graphs. We consider a set $V=\{v_1,\ldots,v_n\}$ of nodes, and generate a random DAG $G=(V,E)$ where $v_i \rightarrow v_j \in E$ with i.i.d probability $p_1$ if $i j$. For any d-separable pair of nodes $v_i$ and $v_j$, we provide upper bounds on the probability that a subset of $V\backslash\{v_i,v_j\}$ d-separates the pair, under different subset selection scenarios; our upper bounds decay exponentially fast to $0$ as $|V| \rightarrow \infty$ for any fixed expected density. We then analyze the average-case performance of constraint-based methods, including the PC Algorithm, a variant of the SGS Algorithm called UniformSGS, and also any constraint-based method limited to small conditioning sets (a limitation which holds in most of existing literature). We show that these algorithms usually suffer from low precision or exponential running time on all but extremely sparse graphs.} }
Endnote
%0 Conference Paper %T On the Unlikelihood of D-Separation %A Itai Feigenbaum %A Devansh Arpit %A Shelby Heinecke %A Juan Carlos Niebles %A Weiran Yao %A Huan Wang %A Caiming Xiong %A Silvio Savarese %B Proceedings of The 12th International Conference on Probabilistic Graphical Models %C Proceedings of Machine Learning Research %D 2024 %E Johan Kwisthout %E Silja Renooij %F pmlr-v246-feigenbaum24a %I PMLR %P 70--92 %U https://proceedings.mlr.press/v246/feigenbaum24a.html %V 246 %X Causal discovery aims to recover a causal graph from data generated by it; constraint-based methods do so by searching for d-separating conditioning sets of nodes. In this paper, we provide analytic evidence that on large graphs, d-separation is a rare phenomenon, even when guaranteed to exist. Our analysis implies poor average-case performance of existing constraint-based methods, except on a vanishingly small class of extremely sparse graphs. We consider a set $V=\{v_1,\ldots,v_n\}$ of nodes, and generate a random DAG $G=(V,E)$ where $v_i \rightarrow v_j \in E$ with i.i.d probability $p_1$ if $i j$. For any d-separable pair of nodes $v_i$ and $v_j$, we provide upper bounds on the probability that a subset of $V\backslash\{v_i,v_j\}$ d-separates the pair, under different subset selection scenarios; our upper bounds decay exponentially fast to $0$ as $|V| \rightarrow \infty$ for any fixed expected density. We then analyze the average-case performance of constraint-based methods, including the PC Algorithm, a variant of the SGS Algorithm called UniformSGS, and also any constraint-based method limited to small conditioning sets (a limitation which holds in most of existing literature). We show that these algorithms usually suffer from low precision or exponential running time on all but extremely sparse graphs.
APA
Feigenbaum, I., Arpit, D., Heinecke, S., Niebles, J.C., Yao, W., Wang, H., Xiong, C. & Savarese, S.. (2024). On the Unlikelihood of D-Separation. Proceedings of The 12th International Conference on Probabilistic Graphical Models, in Proceedings of Machine Learning Research 246:70-92 Available from https://proceedings.mlr.press/v246/feigenbaum24a.html.

Related Material