An Efficient Search-and-Score Algorithm for Ancestral Graphs using Multivariate Information Scores for Complex Non-linear and Categorical Data

Nikita Lagrange, Herve Isambert
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:32164-32187, 2025.

Abstract

We propose a greedy search-and-score algorithm for ancestral graphs, which include directed as well as bidirected edges, originating from unobserved latent variables. The normalized likelihood score of ancestral graphs is estimated in terms of multivariate information over relevant "$ac$-connected subset" of vertices, $\boldsymbol{C}$, that are connected through collider paths confined to the ancestor set of $\boldsymbol{C}$. For computational efficiency, the proposed two-step algorithm relies on local information scores limited to the close surrounding vertices of each node (step 1) and edge (step 2). This computational strategy, although restricted to information contributions from $ac$-connected subsets containing up to two-collider paths, is shown to outperform state-of-the-art causal discovery methods on challenging benchmark datasets.

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-lagrange25a, title = {An Efficient Search-and-Score Algorithm for Ancestral Graphs using Multivariate Information Scores for Complex Non-linear and Categorical Data}, author = {Lagrange, Nikita and Isambert, Herve}, booktitle = {Proceedings of the 42nd International Conference on Machine Learning}, pages = {32164--32187}, year = {2025}, editor = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry}, volume = {267}, series = {Proceedings of Machine Learning Research}, month = {13--19 Jul}, publisher = {PMLR}, pdf = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/lagrange25a/lagrange25a.pdf}, url = {https://proceedings.mlr.press/v267/lagrange25a.html}, abstract = {We propose a greedy search-and-score algorithm for ancestral graphs, which include directed as well as bidirected edges, originating from unobserved latent variables. The normalized likelihood score of ancestral graphs is estimated in terms of multivariate information over relevant "$ac$-connected subset" of vertices, $\boldsymbol{C}$, that are connected through collider paths confined to the ancestor set of $\boldsymbol{C}$. For computational efficiency, the proposed two-step algorithm relies on local information scores limited to the close surrounding vertices of each node (step 1) and edge (step 2). This computational strategy, although restricted to information contributions from $ac$-connected subsets containing up to two-collider paths, is shown to outperform state-of-the-art causal discovery methods on challenging benchmark datasets.} }
Endnote
%0 Conference Paper %T An Efficient Search-and-Score Algorithm for Ancestral Graphs using Multivariate Information Scores for Complex Non-linear and Categorical Data %A Nikita Lagrange %A Herve Isambert %B Proceedings of the 42nd International Conference on Machine Learning %C Proceedings of Machine Learning Research %D 2025 %E Aarti Singh %E Maryam Fazel %E Daniel Hsu %E Simon Lacoste-Julien %E Felix Berkenkamp %E Tegan Maharaj %E Kiri Wagstaff %E Jerry Zhu %F pmlr-v267-lagrange25a %I PMLR %P 32164--32187 %U https://proceedings.mlr.press/v267/lagrange25a.html %V 267 %X We propose a greedy search-and-score algorithm for ancestral graphs, which include directed as well as bidirected edges, originating from unobserved latent variables. The normalized likelihood score of ancestral graphs is estimated in terms of multivariate information over relevant "$ac$-connected subset" of vertices, $\boldsymbol{C}$, that are connected through collider paths confined to the ancestor set of $\boldsymbol{C}$. For computational efficiency, the proposed two-step algorithm relies on local information scores limited to the close surrounding vertices of each node (step 1) and edge (step 2). This computational strategy, although restricted to information contributions from $ac$-connected subsets containing up to two-collider paths, is shown to outperform state-of-the-art causal discovery methods on challenging benchmark datasets.
APA
Lagrange, N. & Isambert, H.. (2025). An Efficient Search-and-Score Algorithm for Ancestral Graphs using Multivariate Information Scores for Complex Non-linear and Categorical Data. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:32164-32187 Available from https://proceedings.mlr.press/v267/lagrange25a.html.

Related Material