Strong and Weak Identifiability of Optimization-based Causal Discovery in Non-linear Additive Noise Models

Mingjia Li, Hong Qian, Tian-Zuo Wang, Shujun Li, Min Zhang, Aimin Zhou
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:35753-35768, 2025.

Abstract

Causal discovery aims to identify causal relationships from observational data. Recently, optimization-based causal discovery methods have attracted extensive attention in the literature due to their efficiency in handling high-dimensional problems. However, we observe that optimization-based methods often perform well on certain problems but struggle with others. This paper identifies a specific characteristic of causal structural equations that determines the difficulty of identification in causal discovery and, in turn, the performance of optimization-based methods. We conduct an in-depth study of the additive noise model (ANM) and propose to further divide identifiable problems into strongly and weakly identifiable types based on the difficulty of identification. We also provide a sufficient condition to distinguish the two categories. Inspired by these findings, this paper further proposes GENE, a generic method for addressing strongly and weakly identifiable problems in a unified way under the ANM assumption. GENE adopts an order-based search framework that incorporates conditional independence tests into order fitness evaluation, ensuring effectiveness on weakly identifiable problems. In addition, GENE restricts the dimensionality of the effect variables to ensure scale invariance, a property crucial for practical applications. Experiments demonstrate that GENE is uniquely effective in addressing weakly identifiable problems while also remaining competitive with state-of-the-art causal discovery algorithms for strongly identifiable problems.

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-li25bx, title = {Strong and Weak Identifiability of Optimization-based Causal Discovery in Non-linear Additive Noise Models}, author = {Li, Mingjia and Qian, Hong and Wang, Tian-Zuo and Li, Shujun and Zhang, Min and Zhou, Aimin}, booktitle = {Proceedings of the 42nd International Conference on Machine Learning}, pages = {35753--35768}, year = {2025}, editor = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry}, volume = {267}, series = {Proceedings of Machine Learning Research}, month = {13--19 Jul}, publisher = {PMLR}, pdf = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/li25bx/li25bx.pdf}, url = {https://proceedings.mlr.press/v267/li25bx.html}, abstract = {Causal discovery aims to identify causal relationships from observational data. Recently, optimization-based causal discovery methods have attracted extensive attention in the literature due to their efficiency in handling high-dimensional problems. However, we observe that optimization-based methods often perform well on certain problems but struggle with others. This paper identifies a specific characteristic of causal structural equations that determines the difficulty of identification in causal discovery and, in turn, the performance of optimization-based methods. We conduct an in-depth study of the additive noise model (ANM) and propose to further divide identifiable problems into strongly and weakly identifiable types based on the difficulty of identification. We also provide a sufficient condition to distinguish the two categories. Inspired by these findings, this paper further proposes GENE, a generic method for addressing strongly and weakly identifiable problems in a unified way under the ANM assumption. GENE adopts an order-based search framework that incorporates conditional independence tests into order fitness evaluation, ensuring effectiveness on weakly identifiable problems. In addition, GENE restricts the dimensionality of the effect variables to ensure scale invariance, a property crucial for practical applications. Experiments demonstrate that GENE is uniquely effective in addressing weakly identifiable problems while also remaining competitive with state-of-the-art causal discovery algorithms for strongly identifiable problems.} }
Endnote
%0 Conference Paper %T Strong and Weak Identifiability of Optimization-based Causal Discovery in Non-linear Additive Noise Models %A Mingjia Li %A Hong Qian %A Tian-Zuo Wang %A Shujun Li %A Min Zhang %A Aimin Zhou %B Proceedings of the 42nd International Conference on Machine Learning %C Proceedings of Machine Learning Research %D 2025 %E Aarti Singh %E Maryam Fazel %E Daniel Hsu %E Simon Lacoste-Julien %E Felix Berkenkamp %E Tegan Maharaj %E Kiri Wagstaff %E Jerry Zhu %F pmlr-v267-li25bx %I PMLR %P 35753--35768 %U https://proceedings.mlr.press/v267/li25bx.html %V 267 %X Causal discovery aims to identify causal relationships from observational data. Recently, optimization-based causal discovery methods have attracted extensive attention in the literature due to their efficiency in handling high-dimensional problems. However, we observe that optimization-based methods often perform well on certain problems but struggle with others. This paper identifies a specific characteristic of causal structural equations that determines the difficulty of identification in causal discovery and, in turn, the performance of optimization-based methods. We conduct an in-depth study of the additive noise model (ANM) and propose to further divide identifiable problems into strongly and weakly identifiable types based on the difficulty of identification. We also provide a sufficient condition to distinguish the two categories. Inspired by these findings, this paper further proposes GENE, a generic method for addressing strongly and weakly identifiable problems in a unified way under the ANM assumption. GENE adopts an order-based search framework that incorporates conditional independence tests into order fitness evaluation, ensuring effectiveness on weakly identifiable problems. In addition, GENE restricts the dimensionality of the effect variables to ensure scale invariance, a property crucial for practical applications. Experiments demonstrate that GENE is uniquely effective in addressing weakly identifiable problems while also remaining competitive with state-of-the-art causal discovery algorithms for strongly identifiable problems.
APA
Li, M., Qian, H., Wang, T., Li, S., Zhang, M. & Zhou, A.. (2025). Strong and Weak Identifiability of Optimization-based Causal Discovery in Non-linear Additive Noise Models. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:35753-35768 Available from https://proceedings.mlr.press/v267/li25bx.html.

Related Material