Identification of Latent Confounders via Investigating the Tensor Ranks of the Nonlinear Observations

Zhengming Chen; Yewei Xia; Feng Xie; Jie Qiao; Zhifeng Hao; Ruichu Cai; Kun Zhang

Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:9415-9442, 2025.

Abstract

We study the problem of learning discrete latent variable causal structures from mixed-type observational data. Traditional methods, such as those based on the tensor rank condition, are designed to identify discrete latent structure models and provide robust identification bounds for discrete causal models. However, when observed variables—specifically, those representing the children of latent variables—are collected at various levels with continuous data types, the tensor rank condition is not applicable, limiting further causal structure learning for latent variables. In this paper, we consider a more general case where observed variables can be either continuous or discrete, and further allow for scenarios where multiple latent parents cause the same set of observed variables. We show that, under the completeness condition, it is possible to discretize the data in a way that satisfies the full-rank assumption required by the tensor rank condition. This enables the identifiability of discrete latent structure models within mixed-type observational data. Moreover, we introduce the two-sufficient measurement condition, a more general structural assumption under which the tensor rank condition holds and the underlying latent causal structure is identifiable by a proposed two-stage identification algorithm. Extensive experiments on both simulated and real-world data validate the effectiveness of our method.
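The link between discretization and rank can be illustrated with a small numerical sketch. This is not the paper's algorithm: it covers only the simplest two-variable (matrix) case, and the Gaussian children, the means, the cut points, and the prior are all hypothetical values chosen for illustration. It assumes two continuous observed children that are independent given one discrete latent parent, discretizes them at fixed cut points, and checks that the rank of the resulting joint probability table matches the number of latent states.

```python
import math


def normal_cdf(x, mean, std=1.0):
    """CDF of a Gaussian; handles x = +/- infinity."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def bin_probs(mean, cuts):
    """P(bin | latent state) for a unit-variance Gaussian child cut at `cuts`."""
    edges = [-math.inf] + list(cuts) + [math.inf]
    return [normal_cdf(hi, mean) - normal_cdf(lo, mean)
            for lo, hi in zip(edges[:-1], edges[1:])]


def joint_table(prior, means_x, means_y, cuts):
    """Population joint table P(X_bin, Y_bin) = sum_l P(l) P(X_bin|l) P(Y_bin|l)."""
    px = [bin_probs(m, cuts) for m in means_x]
    py = [bin_probs(m, cuts) for m in means_y]
    n_bins = len(cuts) + 1
    return [[sum(prior[l] * px[l][i] * py[l][j] for l in range(len(prior)))
             for j in range(n_bins)]
            for i in range(n_bins)]


def matrix_rank(mat, tol=1e-10):
    """Numerical rank via Gaussian elimination with partial pivoting."""
    rows = [row[:] for row in mat]
    rank = 0
    for col in range(len(rows[0])):
        if rank == len(rows):
            break
        pivot = max(range(rank, len(rows)), key=lambda r: abs(rows[r][col]))
        if abs(rows[pivot][col]) < tol:
            continue  # no usable pivot in this column
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for r in range(rank + 1, len(rows)):
            factor = rows[r][col] / rows[rank][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[rank])]
        rank += 1
    return rank


# Hypothetical setup: a binary latent parent and four bins per observed child.
cuts = [-0.5, 1.0, 2.5]
table = joint_table(prior=[0.4, 0.6], means_x=[0.0, 2.0],
                    means_y=[0.0, 2.0], cuts=cuts)
rank = matrix_rank(table)  # at most 2 by construction: a sum of two rank-one terms
print(rank)
```

The table is 4 x 4, yet its rank is bounded by the latent cardinality because it is a mixture of one rank-one term per latent state. The bound is attained only when the discretized conditional distributions are linearly independent across latent states, which is the kind of full-rank requirement the abstract refers to.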

Cite this Paper

BibTeX

@InProceedings{pmlr-v267-chen25bv,
  title = 	 {Identification of Latent Confounders via Investigating the Tensor Ranks of the Nonlinear Observations},
  author =       {Chen, Zhengming and Xia, Yewei and Xie, Feng and Qiao, Jie and Hao, Zhifeng and Cai, Ruichu and Zhang, Kun},
  booktitle = 	 {Proceedings of the 42nd International Conference on Machine Learning},
  pages = 	 {9415--9442},
  year = 	 {2025},
  editor = 	 {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume = 	 {267},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {13--19 Jul},
  publisher =    {PMLR},
  pdf = 	 {https://raw.githubusercontent.com/mlresearch/v267/main/assets/chen25bv/chen25bv.pdf},
  url = 	 {https://proceedings.mlr.press/v267/chen25bv.html},
  abstract = 	 {We study the problem of learning discrete latent variable causal structures from mixed-type observational data. Traditional methods, such as those based on the tensor rank condition, are designed to identify discrete latent structure models and provide robust identification bounds for discrete causal models. However, when observed variables—specifically, those representing the children of latent variables—are collected at various levels with continuous data types, the tensor rank condition is not applicable, limiting further causal structure learning for latent variables. In this paper, we consider a more general case where observed variables can be either continuous or discrete, and further allow for scenarios where multiple latent parents cause the same set of observed variables. We show that, under the completeness condition, it is possible to discretize the data in a way that satisfies the full-rank assumption required by the tensor rank condition. This enables the identifiability of discrete latent structure models within mixed-type observational data. Moreover, we introduce the two-sufficient measurement condition, a more general structural assumption under which the tensor rank condition holds and the underlying latent causal structure is identifiable by a proposed two-stage identification algorithm. Extensive experiments on both simulated and real-world data validate the effectiveness of our method.}
}

Endnote

%0 Conference Paper
%T Identification of Latent Confounders via Investigating the Tensor Ranks of the Nonlinear Observations
%A Zhengming Chen
%A Yewei Xia
%A Feng Xie
%A Jie Qiao
%A Zhifeng Hao
%A Ruichu Cai
%A Kun Zhang
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-chen25bv
%I PMLR
%P 9415--9442
%U https://proceedings.mlr.press/v267/chen25bv.html
%V 267
%X We study the problem of learning discrete latent variable causal structures from mixed-type observational data. Traditional methods, such as those based on the tensor rank condition, are designed to identify discrete latent structure models and provide robust identification bounds for discrete causal models. However, when observed variables—specifically, those representing the children of latent variables—are collected at various levels with continuous data types, the tensor rank condition is not applicable, limiting further causal structure learning for latent variables. In this paper, we consider a more general case where observed variables can be either continuous or discrete, and further allow for scenarios where multiple latent parents cause the same set of observed variables. We show that, under the completeness condition, it is possible to discretize the data in a way that satisfies the full-rank assumption required by the tensor rank condition. This enables the identifiability of discrete latent structure models within mixed-type observational data. Moreover, we introduce the two-sufficient measurement condition, a more general structural assumption under which the tensor rank condition holds and the underlying latent causal structure is identifiable by a proposed two-stage identification algorithm. Extensive experiments on both simulated and real-world data validate the effectiveness of our method.

APA

Chen, Z., Xia, Y., Xie, F., Qiao, J., Hao, Z., Cai, R. & Zhang, K. (2025). Identification of Latent Confounders via Investigating the Tensor Ranks of the Nonlinear Observations. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:9415-9442. Available from https://proceedings.mlr.press/v267/chen25bv.html.
