Expert-In-The-Loop Causal Discovery: Iterative Model Refinement Using Expert Knowledge

Ankur Ankan, Johannes Textor
Proceedings of the Forty-first Conference on Uncertainty in Artificial Intelligence, PMLR 286:172-183, 2025.

Abstract

Many researchers construct directed acyclic graph (DAG) models manually based on domain knowledge. Although numerous causal discovery algorithms were developed to automatically learn DAGs and other causal models from data, these remain challenging to use due to their tendency to produce results that contradict domain knowledge, among other issues. Here we propose a hybrid, iterative structure learning approach that combines domain knowledge with data-driven insights to assist researchers in constructing DAGs. Our method leverages conditional independence testing to iteratively identify variable pairs where an edge is either missing or superfluous. Based on this information, we can choose to add missing edges with appropriate orientation based on domain knowledge or remove unnecessary ones. We also give a method to rank these missing edges based on their impact on the overall model fit. In a simulation study, we find that this iterative approach to leverage domain knowledge already starts outperforming purely data-driven structure learning if the orientation of each new edge is correctly determined in at least two out of three cases. We present a proof-of-concept implementation using a large language model as a domain expert and a graphical user interface designed to assist human experts with DAG construction.
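
As a rough illustration of the testing step described in the abstract, the following minimal sketch flags candidate missing and superfluous edges in a hand-drawn DAG. It is not the authors' implementation. Several things are assumptions made for this example: the data are treated as linear-Gaussian and tested with a partial-correlation (Fisher z) test, non-adjacent pairs are tested conditional on the union of their parents, candidates are ordered by p-value as a crude stand-in for the paper's ranking by impact on model fit, and the names `partial_corr_pvalue` and `suggest_edits` are hypothetical.

```python
import math
from itertools import combinations

import numpy as np


def partial_corr_pvalue(data, x, y, cond):
    """Two-sided p-value of a Fisher z test for column x independent of column y given cond."""
    n = data.shape[0]
    # Regress both variables on the conditioning set (plus intercept) and correlate residuals.
    design = np.column_stack([np.ones(n)] + [data[:, c] for c in cond])
    res_x = data[:, x] - design @ np.linalg.lstsq(design, data[:, x], rcond=None)[0]
    res_y = data[:, y] - design @ np.linalg.lstsq(design, data[:, y], rcond=None)[0]
    r = float(np.corrcoef(res_x, res_y)[0, 1])
    r = max(min(r, 0.999999), -0.999999)  # keep atanh finite
    z = math.atanh(r) * math.sqrt(n - len(cond) - 3)
    return math.erfc(abs(z) / math.sqrt(2))


def suggest_edits(data, parents, alpha=0.05):
    """parents maps each column index to the set of its parents in the current DAG.

    Returns (missing, superfluous), each a list of (node_a, node_b, p_value).
    """
    missing, superfluous = [], []
    for x, y in combinations(sorted(parents), 2):
        if x in parents[y] or y in parents[x]:
            # Existing edge: test parent vs. child given the child's other parents.
            par, child = (x, y) if x in parents[y] else (y, x)
            cond = sorted(parents[child] - {par})
            p = partial_corr_pvalue(data, par, child, cond)
            if p >= alpha:  # no detectable dependence: edge may be superfluous
                superfluous.append((par, child, p))
        else:
            # No edge: the DAG implies independence given the union of parents
            # (conditioning-set choice is an assumption of this sketch).
            cond = sorted((parents[x] | parents[y]) - {x, y})
            p = partial_corr_pvalue(data, x, y, cond)
            if p < alpha:  # implied independence is rejected: edge may be missing
                missing.append((x, y, p))
    # Crude ranking by p-value; the paper ranks by impact on model fit instead.
    missing.sort(key=lambda item: item[2])
    return missing, superfluous


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 2000
    a = rng.normal(size=n)
    b = 0.8 * a + rng.normal(size=n)
    c = 0.8 * b + rng.normal(size=n)
    data = np.column_stack([a, b, c])
    # Start from an empty DAG over three variables and ask for suggestions.
    missing, superfluous = suggest_edits(data, {0: set(), 1: set(), 2: set()})
    print("candidate missing edges:", [(x, y) for x, y, _ in missing])
    print("candidate superfluous edges:", [(x, y) for x, y, _ in superfluous])
```

The sketch returns unordered pairs for missing edges on purpose: in the approach described above, the orientation of each added edge is left to the domain expert rather than inferred from the data.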

Cite this Paper


BibTeX
@InProceedings{pmlr-v286-ankan25a,
  title = {Expert-In-The-Loop Causal Discovery: Iterative Model Refinement Using Expert Knowledge},
  author = {Ankan, Ankur and Textor, Johannes},
  booktitle = {Proceedings of the Forty-first Conference on Uncertainty in Artificial Intelligence},
  pages = {172--183},
  year = {2025},
  editor = {Chiappa, Silvia and Magliacane, Sara},
  volume = {286},
  series = {Proceedings of Machine Learning Research},
  month = {21--25 Jul},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v286/main/assets/ankan25a/ankan25a.pdf},
  url = {https://proceedings.mlr.press/v286/ankan25a.html},
  abstract = {Many researchers construct directed acyclic graph (DAG) models manually based on domain knowledge. Although numerous causal discovery algorithms were developed to automatically learn DAGs and other causal models from data, these remain challenging to use due to their tendency to produce results that contradict domain knowledge, among other issues. Here we propose a hybrid, iterative structure learning approach that combines domain knowledge with data-driven insights to assist researchers in constructing DAGs. Our method leverages conditional independence testing to iteratively identify variable pairs where an edge is either missing or superfluous. Based on this information, we can choose to add missing edges with appropriate orientation based on domain knowledge or remove unnecessary ones. We also give a method to rank these missing edges based on their impact on the overall model fit. In a simulation study, we find that this iterative approach to leverage domain knowledge already starts outperforming purely data-driven structure learning if the orientation of each new edge is correctly determined in at least two out of three cases. We present a proof-of-concept implementation using a large language model as a domain expert and a graphical user interface designed to assist human experts with DAG construction.}
}
Endnote
%0 Conference Paper
%T Expert-In-The-Loop Causal Discovery: Iterative Model Refinement Using Expert Knowledge
%A Ankur Ankan
%A Johannes Textor
%B Proceedings of the Forty-first Conference on Uncertainty in Artificial Intelligence
%C Proceedings of Machine Learning Research
%D 2025
%E Silvia Chiappa
%E Sara Magliacane
%F pmlr-v286-ankan25a
%I PMLR
%P 172--183
%U https://proceedings.mlr.press/v286/ankan25a.html
%V 286
%X Many researchers construct directed acyclic graph (DAG) models manually based on domain knowledge. Although numerous causal discovery algorithms were developed to automatically learn DAGs and other causal models from data, these remain challenging to use due to their tendency to produce results that contradict domain knowledge, among other issues. Here we propose a hybrid, iterative structure learning approach that combines domain knowledge with data-driven insights to assist researchers in constructing DAGs. Our method leverages conditional independence testing to iteratively identify variable pairs where an edge is either missing or superfluous. Based on this information, we can choose to add missing edges with appropriate orientation based on domain knowledge or remove unnecessary ones. We also give a method to rank these missing edges based on their impact on the overall model fit. In a simulation study, we find that this iterative approach to leverage domain knowledge already starts outperforming purely data-driven structure learning if the orientation of each new edge is correctly determined in at least two out of three cases. We present a proof-of-concept implementation using a large language model as a domain expert and a graphical user interface designed to assist human experts with DAG construction.
APA
Ankan, A. & Textor, J. (2025). Expert-In-The-Loop Causal Discovery: Iterative Model Refinement Using Expert Knowledge. Proceedings of the Forty-first Conference on Uncertainty in Artificial Intelligence, in Proceedings of Machine Learning Research 286:172-183. Available from https://proceedings.mlr.press/v286/ankan25a.html.
