The Landscape of Causal Discovery Data: Grounding Causal Discovery in Real-World Applications

Philippe Brouillard, Chandler Squires, Jonas Wahl, Konrad Körding, Karen Sachs, Alexandre Drouin, Dhanya Sridhar
Proceedings of the Fourth Conference on Causal Learning and Reasoning, PMLR 275:834-873, 2025.

Abstract

Causal discovery aims to automatically uncover causal relationships from data, a capability with significant potential across many scientific disciplines. However, its real-world applications remain limited. Current methods often rely on unrealistic assumptions and are evaluated only on simple synthetic toy datasets, often with inadequate evaluation metrics. In this paper, we substantiate these claims by performing a systematic review of the recent causal discovery literature. We present applications in biology, neuroscience, and Earth sciences—fields where causal discovery holds promise for addressing key challenges. We highlight available simulated and real-world datasets from these domains and discuss common assumption violations that have spurred the development of new methods. Our goal is to encourage the community to adopt better evaluation practices by utilizing realistic datasets and more adequate metrics.

Cite this Paper


BibTeX
@InProceedings{pmlr-v275-brouillard25a,
  title = {The Landscape of Causal Discovery Data: Grounding Causal Discovery in Real-World Applications},
  author = {Brouillard, Philippe and Squires, Chandler and Wahl, Jonas and K\"{o}rding, Konrad and Sachs, Karen and Drouin, Alexandre and Sridhar, Dhanya},
  booktitle = {Proceedings of the Fourth Conference on Causal Learning and Reasoning},
  pages = {834--873},
  year = {2025},
  editor = {Huang, Biwei and Drton, Mathias},
  volume = {275},
  series = {Proceedings of Machine Learning Research},
  month = {07--09 May},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v275/main/assets/brouillard25a/brouillard25a.pdf},
  url = {https://proceedings.mlr.press/v275/brouillard25a.html},
  abstract = {Causal discovery aims to automatically uncover causal relationships from data, a capability with significant potential across many scientific disciplines. However, its real-world applications remain limited. Current methods often rely on unrealistic assumptions and are evaluated only on simple synthetic toy datasets, often with inadequate evaluation metrics. In this paper, we substantiate these claims by performing a systematic review of the recent causal discovery literature. We present applications in biology, neuroscience, and Earth sciences—fields where causal discovery holds promise for addressing key challenges. We highlight available simulated and real-world datasets from these domains and discuss common assumption violations that have spurred the development of new methods. Our goal is to encourage the community to adopt better evaluation practices by utilizing realistic datasets and more adequate metrics.}
}
Endnote
%0 Conference Paper
%T The Landscape of Causal Discovery Data: Grounding Causal Discovery in Real-World Applications
%A Philippe Brouillard
%A Chandler Squires
%A Jonas Wahl
%A Konrad Körding
%A Karen Sachs
%A Alexandre Drouin
%A Dhanya Sridhar
%B Proceedings of the Fourth Conference on Causal Learning and Reasoning
%C Proceedings of Machine Learning Research
%D 2025
%E Biwei Huang
%E Mathias Drton
%F pmlr-v275-brouillard25a
%I PMLR
%P 834--873
%U https://proceedings.mlr.press/v275/brouillard25a.html
%V 275
%X Causal discovery aims to automatically uncover causal relationships from data, a capability with significant potential across many scientific disciplines. However, its real-world applications remain limited. Current methods often rely on unrealistic assumptions and are evaluated only on simple synthetic toy datasets, often with inadequate evaluation metrics. In this paper, we substantiate these claims by performing a systematic review of the recent causal discovery literature. We present applications in biology, neuroscience, and Earth sciences—fields where causal discovery holds promise for addressing key challenges. We highlight available simulated and real-world datasets from these domains and discuss common assumption violations that have spurred the development of new methods. Our goal is to encourage the community to adopt better evaluation practices by utilizing realistic datasets and more adequate metrics.
APA
Brouillard, P., Squires, C., Wahl, J., Körding, K., Sachs, K., Drouin, A. & Sridhar, D. (2025). The Landscape of Causal Discovery Data: Grounding Causal Discovery in Real-World Applications. Proceedings of the Fourth Conference on Causal Learning and Reasoning, in Proceedings of Machine Learning Research 275:834-873. Available from https://proceedings.mlr.press/v275/brouillard25a.html.

Related Material