Sample Complexity of Nonparametric Closeness Testing for Continuous Distributions and Its Application to Causal Discovery with Hidden Confounding

Fateme Jamshidi, Sina Akbari, Negar Kiyavash
Proceedings of the Fourth Conference on Causal Learning and Reasoning, PMLR 275:1217-1238, 2025.

Abstract

We study the problem of closeness testing for continuous distributions and its implications for causal discovery. Specifically, we analyze the sample complexity of distinguishing whether two multidimensional continuous distributions are identical or differ by at least $\epsilon$ in terms of Kullback-Leibler (KL) divergence under non-parametric assumptions. To this end, we propose an estimator of KL divergence based on the von Mises expansion. Our closeness test attains optimal parametric rates under smoothness assumptions. Equipped with this test, which serves as a building block of our causal discovery algorithm for identifying the causal structure between two multidimensional random variables, we establish sample complexity guarantees for our causal discovery method. To the best of our knowledge, this is the first work to provide sample complexity guarantees for distinguishing cause and effect in multidimensional non-linear models with non-Gaussian continuous variables in the presence of unobserved confounding.
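To make the closeness-testing setup concrete, the sketch below estimates the KL divergence between two multidimensional samples and thresholds it against $\epsilon$. Note this uses a classical k-nearest-neighbor plug-in estimator (in the style of Wang, Kulkarni, and Verdú), not the paper's von Mises expansion estimator; the `k` parameter and the $\epsilon/2$ decision threshold are illustrative choices, not taken from the paper.

```python
import numpy as np

def knn_kl_divergence(x, y, k=5):
    """k-NN estimate of KL(P || Q) from samples x ~ P, y ~ Q.

    Illustrative classical estimator only; the paper's estimator is
    based on the von Mises expansion and is not reproduced here.
    """
    n, d = x.shape
    m = y.shape[0]
    # Distances from each x_i to all other points of x (self excluded).
    dx = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
    np.fill_diagonal(dx, np.inf)
    # Distances from each x_i to all points of y.
    dy = np.linalg.norm(x[:, None, :] - y[None, :, :], axis=-1)
    rho = np.sort(dx, axis=1)[:, k - 1]  # k-th NN distance within x
    nu = np.sort(dy, axis=1)[:, k - 1]   # k-th NN distance into y
    return d * np.mean(np.log(nu / rho)) + np.log(m / (n - 1))

def closeness_test(x, y, eps, k=5):
    """Return True ('distributions differ') if the KL estimate
    exceeds eps / 2, else False ('distributions are close')."""
    return knn_kl_divergence(x, y, k=k) > eps / 2
```

On two samples from the same distribution the estimate concentrates near zero, while samples from well-separated distributions push it past the threshold; the point of the paper is precisely how many samples this kind of decision requires under non-parametric smoothness assumptions.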

Cite this Paper


BibTeX
@InProceedings{pmlr-v275-jamshidi25a,
  title     = {Sample Complexity of Nonparametric Closeness Testing for Continuous Distributions and Its Application to Causal Discovery with Hidden Confounding},
  author    = {Jamshidi, Fateme and Akbari, Sina and Kiyavash, Negar},
  booktitle = {Proceedings of the Fourth Conference on Causal Learning and Reasoning},
  pages     = {1217--1238},
  year      = {2025},
  editor    = {Huang, Biwei and Drton, Mathias},
  volume    = {275},
  series    = {Proceedings of Machine Learning Research},
  month     = {07--09 May},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v275/main/assets/jamshidi25a/jamshidi25a.pdf},
  url       = {https://proceedings.mlr.press/v275/jamshidi25a.html},
  abstract  = {We study the problem of closeness testing for continuous distributions and its implications for causal discovery. Specifically, we analyze the sample complexity of distinguishing whether two multidimensional continuous distributions are identical or differ by at least $\epsilon$ in terms of Kullback-Leibler (KL) divergence under non-parametric assumptions. To this end, we propose an estimator of KL divergence which is based on the von Mises expansion. Our closeness test attains optimal parametric rates under smoothness assumptions. Equipped with this test, which serves as a building block of our causal discovery algorithm to identify the causal structure between two multidimensional random variables, we establish sample complexity guarantees for our causal discovery method. To the best of our knowledge, this work is the first work that provides sample complexity guarantees for distinguishing cause and effect in multidimensional non-linear models with non-Gaussian continuous variables in the presence of unobserved confounding.}
}
Endnote
%0 Conference Paper
%T Sample Complexity of Nonparametric Closeness Testing for Continuous Distributions and Its Application to Causal Discovery with Hidden Confounding
%A Fateme Jamshidi
%A Sina Akbari
%A Negar Kiyavash
%B Proceedings of the Fourth Conference on Causal Learning and Reasoning
%C Proceedings of Machine Learning Research
%D 2025
%E Biwei Huang
%E Mathias Drton
%F pmlr-v275-jamshidi25a
%I PMLR
%P 1217--1238
%U https://proceedings.mlr.press/v275/jamshidi25a.html
%V 275
%X We study the problem of closeness testing for continuous distributions and its implications for causal discovery. Specifically, we analyze the sample complexity of distinguishing whether two multidimensional continuous distributions are identical or differ by at least $\epsilon$ in terms of Kullback-Leibler (KL) divergence under non-parametric assumptions. To this end, we propose an estimator of KL divergence which is based on the von Mises expansion. Our closeness test attains optimal parametric rates under smoothness assumptions. Equipped with this test, which serves as a building block of our causal discovery algorithm to identify the causal structure between two multidimensional random variables, we establish sample complexity guarantees for our causal discovery method. To the best of our knowledge, this work is the first work that provides sample complexity guarantees for distinguishing cause and effect in multidimensional non-linear models with non-Gaussian continuous variables in the presence of unobserved confounding.
APA
Jamshidi, F., Akbari, S. & Kiyavash, N. (2025). Sample Complexity of Nonparametric Closeness Testing for Continuous Distributions and Its Application to Causal Discovery with Hidden Confounding. Proceedings of the Fourth Conference on Causal Learning and Reasoning, in Proceedings of Machine Learning Research 275:1217-1238. Available from https://proceedings.mlr.press/v275/jamshidi25a.html.