Exact discovery is polynomial for certain sparse causal Bayesian networks

Felix Leopoldo Rios, Giusi Moffa, Jack Kuipers
Proceedings of the Fourth Conference on Causal Learning and Reasoning, PMLR 275:631-658, 2025.

Abstract

Causal Bayesian networks are widely used tools for summarising the dependencies between variables and elucidating their putative causal relationships. By restricting the search to trees, for example, learning the optimum from data is polynomial, but this does not guarantee finding the optimal network overall. Without similar restrictions, exact discovery of the optimum is computationally hard in general and no polynomial results are known. The current state-of-the-art approaches are integer linear programming over the underlying space of directed acyclic graphs, dynamic programming and shortest-path searches over the space of topological orders, and constraint programming combining both. For dynamic programming over orders, the computational complexity is known to be exponential base 2 in the number of variables in the network. We demonstrate how to use properties of Bayesian networks to prune the search space and lower the computational cost, while still guaranteeing exact discovery of the provably optimal network. We also include new path-search and divide-and-conquer criteria. Without a priori constraining the search to certain types of networks, the algorithm completes in quadratic time when the optimum is a matching, and in polynomial time when the optimum belongs to any network class with logarithmically-bound largest connected components. In simulation studies we observe the polynomial dependence for sparse networks and that, beyond some critical value, the logarithm of the base grows with the network density. Our approach then out-competes the state-of-the-art at lower densities. These results therefore pave the way for faster exact causal discovery in larger and sparser networks.
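As a point of reference for the "exponential base 2" remark on dynamic programming over orders, the following is a minimal sketch of a generic subset-based dynamic programme for exact structure learning. It is not the authors' pruned algorithm. The function names, the `local_score(node, parents)` interface and the toy score are assumptions made for illustration only. The table has one entry per subset of the n variables, hence 2^n states; the sketch enumerates candidate parent sets naively, so it is slower than tuned implementations.

```python
from itertools import combinations


def exact_dag_by_subset_dp(n, local_score):
    """Exact search over node subsets (hypothetical helper, for illustration).

    local_score(node, parents) is an assumed decomposable score, higher is better.
    Returns (best_total_score, {node: frozenset_of_parents}).
    """
    full = (1 << n) - 1
    best = [float("-inf")] * (1 << n)   # best[mask]: best score of a DAG on the nodes in mask
    choice = [None] * (1 << n)          # (sink, parents) that achieves best[mask]
    best[0] = 0.0
    for mask in range(1, full + 1):     # 2^n - 1 non-empty subsets
        members = [v for v in range(n) if mask >> v & 1]
        for sink in members:            # try each node as the last node in the order
            rest = [u for u in members if u != sink]
            rest_mask = mask & ~(1 << sink)
            # Naive search for the best parent set of `sink` among `rest`.
            top_score, top_parents = max(
                ((local_score(sink, frozenset(p)), frozenset(p))
                 for k in range(len(rest) + 1)
                 for p in combinations(rest, k)),
                key=lambda pair: pair[0],
            )
            total = best[rest_mask] + top_score
            if total > best[mask]:
                best[mask] = total
                choice[mask] = (sink, top_parents)
    # Backtrack from the full set to recover each node's parent set.
    parents, mask = {}, full
    while mask:
        sink, chosen = choice[mask]
        parents[sink] = chosen
        mask &= ~(1 << sink)
    return best[full], parents


def toy_local_score(node, parents):
    """Toy score (an assumption): the optimum is the matching 0 -> 1, 2 -> 3."""
    wanted = {1: {0}, 3: {2}}
    return -len(parents ^ wanted.get(node, set()))
```

Calling `exact_dag_by_subset_dp(4, toy_local_score)` recovers the two disjoint edges of the toy matching. The number of table entries doubles with every added variable regardless of how sparse the optimum is, which is the cost the abstract says pruning is meant to reduce.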

Cite this Paper


BibTeX
@InProceedings{pmlr-v275-rios25a,
  title = {Exact discovery is polynomial for certain sparse causal Bayesian networks},
  author = {Rios, Felix Leopoldo and Moffa, Giusi and Kuipers, Jack},
  booktitle = {Proceedings of the Fourth Conference on Causal Learning and Reasoning},
  pages = {631--658},
  year = {2025},
  editor = {Huang, Biwei and Drton, Mathias},
  volume = {275},
  series = {Proceedings of Machine Learning Research},
  month = {07--09 May},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v275/main/assets/rios25a/rios25a.pdf},
  url = {https://proceedings.mlr.press/v275/rios25a.html},
  abstract = {Causal Bayesian networks are widely used tools for summarising the dependencies between variables and elucidating their putative causal relationships. By restricting the search to trees, for example, learning the optimum from data is polynomial, but this does not guarantee finding the optimal network overall. Without similar restrictions, exact discovery of the optimum is computationally hard in general and no polynomial results are known. The current state-of-the-art approaches are integer linear programming over the underlying space of directed acyclic graphs, dynamic programming and shortest-path searches over the space of topological orders, and constraint programming combining both. For dynamic programming over orders, the computational complexity is known to be exponential base 2 in the number of variables in the network. We demonstrate how to use properties of Bayesian networks to prune the search space and lower the computational cost, while still guaranteeing exact discovery of the provably optimal network. We also include new path-search and divide-and-conquer criteria. Without a priori constraining the search to certain types of networks, the algorithm completes in quadratic time when the optimum is a matching, and in polynomial time when the optimum belongs to any network class with logarithmically-bound largest connected components. In simulation studies we observe the polynomial dependence for sparse networks and that, beyond some critical value, the logarithm of the base grows with the network density. Our approach then out-competes the state-of-the-art at lower densities. These results therefore pave the way for faster exact causal discovery in larger and sparser networks.}
}
Endnote
%0 Conference Paper
%T Exact discovery is polynomial for certain sparse causal Bayesian networks
%A Felix Leopoldo Rios
%A Giusi Moffa
%A Jack Kuipers
%B Proceedings of the Fourth Conference on Causal Learning and Reasoning
%C Proceedings of Machine Learning Research
%D 2025
%E Biwei Huang
%E Mathias Drton
%F pmlr-v275-rios25a
%I PMLR
%P 631--658
%U https://proceedings.mlr.press/v275/rios25a.html
%V 275
%X Causal Bayesian networks are widely used tools for summarising the dependencies between variables and elucidating their putative causal relationships. By restricting the search to trees, for example, learning the optimum from data is polynomial, but this does not guarantee finding the optimal network overall. Without similar restrictions, exact discovery of the optimum is computationally hard in general and no polynomial results are known. The current state-of-the-art approaches are integer linear programming over the underlying space of directed acyclic graphs, dynamic programming and shortest-path searches over the space of topological orders, and constraint programming combining both. For dynamic programming over orders, the computational complexity is known to be exponential base 2 in the number of variables in the network. We demonstrate how to use properties of Bayesian networks to prune the search space and lower the computational cost, while still guaranteeing exact discovery of the provably optimal network. We also include new path-search and divide-and-conquer criteria. Without a priori constraining the search to certain types of networks, the algorithm completes in quadratic time when the optimum is a matching, and in polynomial time when the optimum belongs to any network class with logarithmically-bound largest connected components. In simulation studies we observe the polynomial dependence for sparse networks and that, beyond some critical value, the logarithm of the base grows with the network density. Our approach then out-competes the state-of-the-art at lower densities. These results therefore pave the way for faster exact causal discovery in larger and sparser networks.
APA
Rios, F.L., Moffa, G. & Kuipers, J. (2025). Exact discovery is polynomial for certain sparse causal Bayesian networks. Proceedings of the Fourth Conference on Causal Learning and Reasoning, in Proceedings of Machine Learning Research 275:631-658. Available from https://proceedings.mlr.press/v275/rios25a.html.
