Self-Compatibility: Evaluating Causal Discovery without Ground Truth

Philipp M. Faller, Leena C. Vankadara, Atalanti A. Mastakouri, Francesco Locatello, Dominik Janzing
Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, PMLR 238:4132-4140, 2024.

Abstract

As causal ground truth is incredibly rare, causal discovery algorithms are commonly only evaluated on simulated data. This is concerning, given that simulations reflect preconceptions about generating processes regarding noise distributions, model classes, and more. In this work, we propose a novel method for falsifying the output of a causal discovery algorithm in the absence of ground truth. Our key insight is that while statistical learning seeks stability across subsets of data points, causal learning should seek stability across subsets of variables. Motivated by this insight, our method relies on a notion of compatibility between causal graphs learned on different subsets of variables. We prove that detecting incompatibilities can falsify wrongly inferred causal relations due to violation of assumptions or errors from finite sample effects. Although passing such compatibility tests is only a necessary criterion for good performance, we argue that it provides strong evidence for the causal models whenever compatibility entails strong implications for the joint distribution. We also demonstrate experimentally that detection of incompatibilities can aid in causal model selection.
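The idea of comparing graphs learned on different subsets of variables can be illustrated with a minimal sketch. This is not the compatibility notion defined in the paper. It is a simplified stand-in that assumes acyclic graphs and flags only one kind of contradiction: a pair of variables where one graph implies that X is an ancestor of Z while the other implies that Z is an ancestor of X. Since marginalizing a DAG preserves ancestral relations among the remaining variables, two graphs with such a conflict cannot both be marginals of one common DAG. The function names and the edge-list representation are hypothetical choices made for illustration.

```python
def ancestor_pairs(edges):
    """Return all (a, d) pairs such that a is a proper ancestor of d.

    `edges` is an iterable of (parent, child) tuples describing a directed graph.
    """
    children = {}
    for parent, child in edges:
        children.setdefault(parent, set()).add(child)

    pairs = set()
    for start in children:
        # Depth-first traversal collecting every node reachable from `start`.
        stack = list(children[start])
        seen = set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            pairs.add((start, node))
            stack.extend(children.get(node, ()))
    return pairs


def ancestral_conflicts(edges_a, edges_b):
    """List pairs (x, y) where graph A says x precedes y and graph B says the reverse.

    Simplified illustration only: an empty result does not establish that the
    graphs are compatible in the sense used in the paper.
    """
    anc_a = ancestor_pairs(edges_a)
    anc_b = ancestor_pairs(edges_b)
    return sorted((x, y) for (x, y) in anc_a if (y, x) in anc_b)


# Hypothetical outputs of a discovery algorithm on two overlapping variable subsets.
graph_xyz = [("X", "Y"), ("Y", "Z")]   # learned on {X, Y, Z}
graph_xzw = [("Z", "X"), ("X", "W")]   # learned on {X, Z, W}

print(ancestral_conflicts(graph_xyz, graph_xzw))
```

Here the first graph makes X an ancestor of Z through Y, while the second contains the edge from Z to X, so the pair (X, Z) is reported as a conflict. In the spirit of the abstract, such a conflict would count as evidence against at least one of the two outputs, without any ground-truth graph being consulted.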

Cite this Paper


BibTeX
@InProceedings{pmlr-v238-faller24a,
  title = {Self-Compatibility: Evaluating Causal Discovery without Ground Truth},
  author = {Faller, Philipp M. and Vankadara, Leena C. and Mastakouri, Atalanti A. and Locatello, Francesco and Janzing, Dominik},
  booktitle = {Proceedings of The 27th International Conference on Artificial Intelligence and Statistics},
  pages = {4132--4140},
  year = {2024},
  editor = {Dasgupta, Sanjoy and Mandt, Stephan and Li, Yingzhen},
  volume = {238},
  series = {Proceedings of Machine Learning Research},
  month = {02--04 May},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v238/faller24a/faller24a.pdf},
  url = {https://proceedings.mlr.press/v238/faller24a.html},
  abstract = {As causal ground truth is incredibly rare, causal discovery algorithms are commonly only evaluated on simulated data. This is concerning, given that simulations reflect preconceptions about generating processes regarding noise distributions, model classes, and more. In this work, we propose a novel method for falsifying the output of a causal discovery algorithm in the absence of ground truth. Our key insight is that while statistical learning seeks stability across subsets of data points, causal learning should seek stability across subsets of variables. Motivated by this insight, our method relies on a notion of compatibility between causal graphs learned on different subsets of variables. We prove that detecting incompatibilities can falsify wrongly inferred causal relations due to violation of assumptions or errors from finite sample effects. Although passing such compatibility tests is only a necessary criterion for good performance, we argue that it provides strong evidence for the causal models whenever compatibility entails strong implications for the joint distribution. We also demonstrate experimentally that detection of incompatibilities can aid in causal model selection.}
}
APA
Faller, P.M., Vankadara, L.C., Mastakouri, A.A., Locatello, F. & Janzing, D. (2024). Self-Compatibility: Evaluating Causal Discovery without Ground Truth. Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 238:4132-4140. Available from https://proceedings.mlr.press/v238/faller24a.html.
