Constraint-based causal discovery with tiered background knowledge and latent variables in single or overlapping datasets

Christine W. Bang, Vanessa Didelez
Proceedings of the Fourth Conference on Causal Learning and Reasoning, PMLR 275:1116-1146, 2025.

Abstract

In this paper we consider the use of tiered background knowledge within constraint-based causal discovery. Our focus is on settings relaxing causal sufficiency, i.e. allowing for latent variables which may arise because relevant information could not be measured at all, or not jointly, as in the case of multiple overlapping datasets. We first present novel insights into the properties of the ‘tiered FCI’ (tFCI) algorithm. Building on this, we introduce a new extension of the IOD (integrating overlapping datasets) algorithm incorporating tiered background knowledge, the ‘tiered IOD’ (tIOD) algorithm. We show that under full usage of the tiered background knowledge tFCI and tIOD are sound, while simple versions of the tIOD and tFCI are sound and complete. We further show that the tIOD algorithm can often be expected to be considerably more efficient and informative than the IOD algorithm even beyond the obvious restriction of the Markov equivalence classes. We provide a formal result on the conditions for this gain in efficiency and informativeness. Our results are accompanied by a series of examples illustrating the exact role and usefulness of tiered background knowledge.

Cite this Paper


BibTeX
@InProceedings{pmlr-v275-bang25a,
  title     = {Constraint-based causal discovery with tiered background knowledge and latent variables in single or overlapping datasets},
  author    = {Bang, Christine W. and Didelez, Vanessa},
  booktitle = {Proceedings of the Fourth Conference on Causal Learning and Reasoning},
  pages     = {1116--1146},
  year      = {2025},
  editor    = {Huang, Biwei and Drton, Mathias},
  volume    = {275},
  series    = {Proceedings of Machine Learning Research},
  month     = {07--09 May},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v275/main/assets/bang25a/bang25a.pdf},
  url       = {https://proceedings.mlr.press/v275/bang25a.html},
  abstract  = {In this paper we consider the use of tiered background knowledge within constraint-based causal discovery. Our focus is on settings relaxing causal sufficiency, i.e. allowing for latent variables which may arise because relevant information could not be measured at all, or not jointly, as in the case of multiple overlapping datasets. We first present novel insights into the properties of the `tiered FCI' (tFCI) algorithm. Building on this, we introduce a new extension of the IOD (integrating overlapping datasets) algorithm incorporating tiered background knowledge, the `tiered IOD' (tIOD) algorithm. We show that under full usage of the tiered background knowledge tFCI and tIOD are sound, while simple versions of the tIOD and tFCI are sound and complete. We further show that the tIOD algorithm can often be expected to be considerably more efficient and informative than the IOD algorithm even beyond the obvious restriction of the Markov equivalence classes. We provide a formal result on the conditions for this gain in efficiency and informativeness. Our results are accompanied by a series of examples illustrating the exact role and usefulness of tiered background knowledge.}
}
Endnote
%0 Conference Paper
%T Constraint-based causal discovery with tiered background knowledge and latent variables in single or overlapping datasets
%A Christine W. Bang
%A Vanessa Didelez
%B Proceedings of the Fourth Conference on Causal Learning and Reasoning
%C Proceedings of Machine Learning Research
%D 2025
%E Biwei Huang
%E Mathias Drton
%F pmlr-v275-bang25a
%I PMLR
%P 1116--1146
%U https://proceedings.mlr.press/v275/bang25a.html
%V 275
%X In this paper we consider the use of tiered background knowledge within constraint-based causal discovery. Our focus is on settings relaxing causal sufficiency, i.e. allowing for latent variables which may arise because relevant information could not be measured at all, or not jointly, as in the case of multiple overlapping datasets. We first present novel insights into the properties of the ‘tiered FCI’ (tFCI) algorithm. Building on this, we introduce a new extension of the IOD (integrating overlapping datasets) algorithm incorporating tiered background knowledge, the ‘tiered IOD’ (tIOD) algorithm. We show that under full usage of the tiered background knowledge tFCI and tIOD are sound, while simple versions of the tIOD and tFCI are sound and complete. We further show that the tIOD algorithm can often be expected to be considerably more efficient and informative than the IOD algorithm even beyond the obvious restriction of the Markov equivalence classes. We provide a formal result on the conditions for this gain in efficiency and informativeness. Our results are accompanied by a series of examples illustrating the exact role and usefulness of tiered background knowledge.
APA
Bang, C. W. & Didelez, V. (2025). Constraint-based causal discovery with tiered background knowledge and latent variables in single or overlapping datasets. Proceedings of the Fourth Conference on Causal Learning and Reasoning, in Proceedings of Machine Learning Research 275:1116-1146. Available from https://proceedings.mlr.press/v275/bang25a.html.
