On the sample complexity of conditional independence testing with Von Mises estimator with application to causal discovery

Fateme Jamshidi, Luca Ganassali, Negar Kiyavash
Proceedings of the 41st International Conference on Machine Learning, PMLR 235:21225-21240, 2024.

Abstract

Motivated by conditional independence testing, an essential step in constraint-based causal discovery algorithms, we study the nonparametric Von Mises estimator for the entropy of multivariate distributions built on a kernel density estimator. We establish an exponential concentration inequality for this estimator. We design a test for conditional independence (CI) based on our estimator, called VM-CI, which achieves optimal parametric rates under smoothness assumptions. Leveraging the exponential concentration, we prove a tight upper bound for the overall error of VM-CI. This, in turn, allows us to characterize the sample complexity of any constraint-based causal discovery algorithm that uses VM-CI for CI tests. To the best of our knowledge, this is the first sample complexity guarantee for causal discovery for non-linear models and non-Gaussian continuous variables. Furthermore, we empirically show that VM-CI outperforms other popular CI tests in terms of either time, sample complexity, or both. This enhancement significantly improves the performance in structure learning as well.
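
To make the recipe in the abstract concrete, below is a minimal sketch, not the authors' VM-CI implementation: with data splitting, a common form of the first-order Von Mises (one-step) entropy estimator reduces to the average negative log-density of held-out samples under a kernel density estimate fit on the other half; four such entropy estimates combine into the conditional mutual information I(X;Y|Z) = H(X,Z) + H(Y,Z) - H(Z) - H(X,Y,Z), which is thresholded to decide conditional independence. The bandwidth and threshold values are illustrative placeholders; the paper calibrates the test via its concentration inequality.

import numpy as np
from sklearn.neighbors import KernelDensity

def vm_entropy(samples, bandwidth=0.5):
    # One-step Von Mises entropy estimate with sample splitting:
    # fit a Gaussian KDE on the first half of the data, then average
    # the negative log-density over the second (held-out) half.
    n = samples.shape[0]
    fit_half, eval_half = samples[: n // 2], samples[n // 2:]
    kde = KernelDensity(kernel="gaussian", bandwidth=bandwidth).fit(fit_half)
    return -np.mean(kde.score_samples(eval_half))

def cmi(x, y, z, bandwidth=0.5):
    # I(X;Y|Z) = H(X,Z) + H(Y,Z) - H(Z) - H(X,Y,Z), each term estimated
    # separately by the plug-in above. x, y, z are (n, d) arrays.
    h_xz = vm_entropy(np.column_stack([x, z]), bandwidth)
    h_yz = vm_entropy(np.column_stack([y, z]), bandwidth)
    h_z = vm_entropy(z, bandwidth)
    h_xyz = vm_entropy(np.column_stack([x, y, z]), bandwidth)
    return h_xz + h_yz - h_z - h_xyz

def ci_test(x, y, z, threshold=0.05, bandwidth=0.5):
    # Declare X independent of Y given Z when the estimated CMI is small.
    # The threshold here is a hypothetical choice, not the paper's.
    return cmi(x, y, z, bandwidth) < threshold

# Usage: X <- Z -> Y is a common cause structure, so X _||_ Y | Z holds.
rng = np.random.default_rng(0)
n = 5000
z = rng.normal(size=(n, 1))
x = z + 0.5 * rng.normal(size=(n, 1))
y = np.sin(z) + 0.5 * rng.normal(size=(n, 1))
print(ci_test(x, y, z))  # expected: True (estimated CMI close to zero)

Note that the CMI identity above holds exactly for differential entropies; the quality of the resulting test rests entirely on how fast the entropy estimates concentrate, which is precisely what the paper's exponential concentration inequality controls.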

Cite this Paper


BibTeX
@InProceedings{pmlr-v235-jamshidi24a,
  title     = {On the sample complexity of conditional independence testing with Von Mises estimator with application to causal discovery},
  author    = {Jamshidi, Fateme and Ganassali, Luca and Kiyavash, Negar},
  booktitle = {Proceedings of the 41st International Conference on Machine Learning},
  pages     = {21225--21240},
  year      = {2024},
  editor    = {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix},
  volume    = {235},
  series    = {Proceedings of Machine Learning Research},
  month     = {21--27 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v235/main/assets/jamshidi24a/jamshidi24a.pdf},
  url       = {https://proceedings.mlr.press/v235/jamshidi24a.html},
  abstract  = {Motivated by conditional independence testing, an essential step in constraint-based causal discovery algorithms, we study the nonparametric Von Mises estimator for the entropy of multivariate distributions built on a kernel density estimator. We establish an exponential concentration inequality for this estimator. We design a test for conditional independence (CI) based on our estimator, called VM-CI, which achieves optimal parametric rates under smoothness assumptions. Leveraging the exponential concentration, we prove a tight upper bound for the overall error of VM-CI. This, in turn, allows us to characterize the sample complexity of any constraint-based causal discovery algorithm that uses VM-CI for CI tests. To the best of our knowledge, this is the first sample complexity guarantee for causal discovery for non-linear models and non-Gaussian continuous variables. Furthermore, we empirically show that VM-CI outperforms other popular CI tests in terms of either time, sample complexity, or both. This enhancement significantly improves the performance in structure learning as well.}
}
Endnote
%0 Conference Paper
%T On the sample complexity of conditional independence testing with Von Mises estimator with application to causal discovery
%A Fateme Jamshidi
%A Luca Ganassali
%A Negar Kiyavash
%B Proceedings of the 41st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Ruslan Salakhutdinov
%E Zico Kolter
%E Katherine Heller
%E Adrian Weller
%E Nuria Oliver
%E Jonathan Scarlett
%E Felix Berkenkamp
%F pmlr-v235-jamshidi24a
%I PMLR
%P 21225--21240
%U https://proceedings.mlr.press/v235/jamshidi24a.html
%V 235
%X Motivated by conditional independence testing, an essential step in constraint-based causal discovery algorithms, we study the nonparametric Von Mises estimator for the entropy of multivariate distributions built on a kernel density estimator. We establish an exponential concentration inequality for this estimator. We design a test for conditional independence (CI) based on our estimator, called VM-CI, which achieves optimal parametric rates under smoothness assumptions. Leveraging the exponential concentration, we prove a tight upper bound for the overall error of VM-CI. This, in turn, allows us to characterize the sample complexity of any constraint-based causal discovery algorithm that uses VM-CI for CI tests. To the best of our knowledge, this is the first sample complexity guarantee for causal discovery for non-linear models and non-Gaussian continuous variables. Furthermore, we empirically show that VM-CI outperforms other popular CI tests in terms of either time, sample complexity, or both. This enhancement significantly improves the performance in structure learning as well.
APA
Jamshidi, F., Ganassali, L. & Kiyavash, N. (2024). On the sample complexity of conditional independence testing with Von Mises estimator with application to causal discovery. Proceedings of the 41st International Conference on Machine Learning, in Proceedings of Machine Learning Research 235:21225-21240. Available from https://proceedings.mlr.press/v235/jamshidi24a.html.