Learning Identifiable Structures Helps Avoid Bias in DNN-based Supervised Causal Learning

Jiaru Zhang, Rui Ding, Qiang Fu, Huang Bojun, Zizhen Deng, Yang Hua, Haibing Guan, Shi Han, Dongmei Zhang
Proceedings of The 28th International Conference on Artificial Intelligence and Statistics, PMLR 258:577-585, 2025.

Abstract

Causal discovery is a structured prediction task that aims to predict causal relations among variables based on their data samples. Supervised Causal Learning (SCL) is an emerging paradigm in this field. Existing Deep Neural Network (DNN)-based methods commonly adopt the “Node-Edge approach”, in which the model first computes an embedding vector for each variable-node, then uses these variable-wise representations to concurrently and independently predict for each directed causal-edge. In this paper, we first show that this architecture has some systematic bias that cannot be mitigated regardless of model size and data size. We then propose SiCL, a DNN-based SCL method that predicts a skeleton matrix together with a v-tensor (a third-order tensor representing the v-structures). According to the Markov Equivalence Class (MEC) theory, both the skeleton and the v-structures are \emph{identifiable} causal structures under the canonical MEC setting, so predictions about skeleton and v-structures do not suffer from the identifiability limit in causal discovery, thus SiCL can avoid the systematic bias in Node-Edge architecture, and enable consistent estimators for causal discovery. Moreover, SiCL is also equipped with a specially designed pairwise encoder module with a unidirectional attention layer to model both internal and external relationships of pairs of nodes. Experimental results on both synthetic and real-world benchmarks show that SiCL significantly outperforms other DNN-based SCL approaches.
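To make the two identifiable structures concrete, here is a minimal sketch (not from the paper's code; the example graph and all names are hypothetical) of how a skeleton matrix and a v-tensor can encode a DAG. The skeleton is the symmetric adjacency with edge directions dropped; the v-tensor marks each collider triple X_i → X_k ← X_j whose endpoints X_i and X_j are non-adjacent.

```python
import numpy as np

# Hypothetical 4-variable DAG: X0 -> X2 <- X1 (a v-structure), plus X2 -> X3.
n = 4
adj = np.zeros((n, n), dtype=int)   # adj[i, j] = 1 means X_i -> X_j
adj[0, 2] = adj[1, 2] = adj[2, 3] = 1

# Skeleton matrix: symmetric adjacency, direction information discarded.
skeleton = ((adj + adj.T) > 0).astype(int)

# V-tensor: v[i, k, j] = 1 iff X_i -> X_k <- X_j and X_i, X_j are non-adjacent.
v = np.zeros((n, n, n), dtype=int)
for k in range(n):
    parents = np.flatnonzero(adj[:, k])
    for i in parents:
        for j in parents:
            if i != j and skeleton[i, j] == 0:
                v[i, k, j] = 1

print(skeleton[0, 2], skeleton[2, 0])  # 1 1 (skeleton is symmetric)
print(v[0, 2, 1], v[1, 2, 0])          # 1 1 (collider X0 -> X2 <- X1, both orders)
```

Note that X2 → X3 contributes no v-structure (X3 has a single parent), which matches the MEC theory the abstract invokes: only the skeleton and the colliders with non-adjacent endpoints are identifiable from observational data.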

Cite this Paper
BibTeX
@InProceedings{pmlr-v258-zhang25b,
  title     = {Learning Identifiable Structures Helps Avoid Bias in DNN-based Supervised Causal Learning},
  author    = {Zhang, Jiaru and Ding, Rui and Fu, Qiang and Bojun, Huang and Deng, Zizhen and Hua, Yang and Guan, Haibing and Han, Shi and Zhang, Dongmei},
  booktitle = {Proceedings of The 28th International Conference on Artificial Intelligence and Statistics},
  pages     = {577--585},
  year      = {2025},
  editor    = {Li, Yingzhen and Mandt, Stephan and Agrawal, Shipra and Khan, Emtiyaz},
  volume    = {258},
  series    = {Proceedings of Machine Learning Research},
  month     = {03--05 May},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v258/main/assets/zhang25b/zhang25b.pdf},
  url       = {https://proceedings.mlr.press/v258/zhang25b.html},
  abstract  = {Causal discovery is a structured prediction task that aims to predict causal relations among variables based on their data samples. Supervised Causal Learning (SCL) is an emerging paradigm in this field. Existing Deep Neural Network (DNN)-based methods commonly adopt the “Node-Edge approach”, in which the model first computes an embedding vector for each variable-node, then uses these variable-wise representations to concurrently and independently predict for each directed causal-edge. In this paper, we first show that this architecture has some systematic bias that cannot be mitigated regardless of model size and data size. We then propose SiCL, a DNN-based SCL method that predicts a skeleton matrix together with a v-tensor (a third-order tensor representing the v-structures). According to the Markov Equivalence Class (MEC) theory, both the skeleton and the v-structures are \emph{identifiable} causal structures under the canonical MEC setting, so predictions about skeleton and v-structures do not suffer from the identifiability limit in causal discovery, thus SiCL can avoid the systematic bias in Node-Edge architecture, and enable consistent estimators for causal discovery. Moreover, SiCL is also equipped with a specially designed pairwise encoder module with a unidirectional attention layer to model both internal and external relationships of pairs of nodes. Experimental results on both synthetic and real-world benchmarks show that SiCL significantly outperforms other DNN-based SCL approaches.}
}
Endnote
%0 Conference Paper
%T Learning Identifiable Structures Helps Avoid Bias in DNN-based Supervised Causal Learning
%A Jiaru Zhang
%A Rui Ding
%A Qiang Fu
%A Huang Bojun
%A Zizhen Deng
%A Yang Hua
%A Haibing Guan
%A Shi Han
%A Dongmei Zhang
%B Proceedings of The 28th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2025
%E Yingzhen Li
%E Stephan Mandt
%E Shipra Agrawal
%E Emtiyaz Khan
%F pmlr-v258-zhang25b
%I PMLR
%P 577--585
%U https://proceedings.mlr.press/v258/zhang25b.html
%V 258
%X Causal discovery is a structured prediction task that aims to predict causal relations among variables based on their data samples. Supervised Causal Learning (SCL) is an emerging paradigm in this field. Existing Deep Neural Network (DNN)-based methods commonly adopt the “Node-Edge approach”, in which the model first computes an embedding vector for each variable-node, then uses these variable-wise representations to concurrently and independently predict for each directed causal-edge. In this paper, we first show that this architecture has some systematic bias that cannot be mitigated regardless of model size and data size. We then propose SiCL, a DNN-based SCL method that predicts a skeleton matrix together with a v-tensor (a third-order tensor representing the v-structures). According to the Markov Equivalence Class (MEC) theory, both the skeleton and the v-structures are \emph{identifiable} causal structures under the canonical MEC setting, so predictions about skeleton and v-structures do not suffer from the identifiability limit in causal discovery, thus SiCL can avoid the systematic bias in Node-Edge architecture, and enable consistent estimators for causal discovery. Moreover, SiCL is also equipped with a specially designed pairwise encoder module with a unidirectional attention layer to model both internal and external relationships of pairs of nodes. Experimental results on both synthetic and real-world benchmarks show that SiCL significantly outperforms other DNN-based SCL approaches.
APA
Zhang, J., Ding, R., Fu, Q., Bojun, H., Deng, Z., Hua, Y., Guan, H., Han, S. & Zhang, D. (2025). Learning Identifiable Structures Helps Avoid Bias in DNN-based Supervised Causal Learning. Proceedings of The 28th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 258:577-585. Available from https://proceedings.mlr.press/v258/zhang25b.html.