Bivariate Causal Discovery with Proxy Variables: Integral Solving and Beyond

Yong Wu, Yanwei Fu, Shouyan Wang, Xinwei Sun
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:67202-67259, 2025.

Abstract

Bivariate causal discovery is challenging when unmeasured confounders exist. To adjust for the resulting bias, previous methods employed a proxy variable (i.e., a negative control outcome (NCO)) to test the treatment-outcome relationship through integral equations, assuming that a violation of the equation indicates a causal relationship. On this basis, they could establish asymptotic properties for causal hypothesis testing. However, these methods either relied on parametric assumptions or required discretizing continuous variables, which may lead to information loss. Moreover, it is unclear when the underlying integral-related assumption holds, making it difficult to justify the methods' utility in practice. To address these problems, we first consider the scenario where only the NCO is available. We propose a novel non-parametric procedure, which enjoys asymptotic properties and preserves more information. Moreover, we find that when the NCO affects the outcome, the above integral-related assumption may not hold, rendering the causal relation unidentifiable. Informed by this, we further consider the scenario where a negative control exposure (NCE) is also available. In this scenario, we construct another integral restriction aided by this proxy, which can discover causation when the NCO affects the outcome. We demonstrate these findings and the effectiveness of our proposals through comprehensive numerical studies.

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-wu25h,
  title = {Bivariate Causal Discovery with Proxy Variables: Integral Solving and Beyond},
  author = {Wu, Yong and Fu, Yanwei and Wang, Shouyan and Sun, Xinwei},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages = {67202--67259},
  year = {2025},
  editor = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume = {267},
  series = {Proceedings of Machine Learning Research},
  month = {13--19 Jul},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/wu25h/wu25h.pdf},
  url = {https://proceedings.mlr.press/v267/wu25h.html},
  abstract = {Bivariate causal discovery is challenging when unmeasured confounders exist. To adjust for the bias, previous methods employed the proxy variable (i.e., negative control outcome (NCO)) to test the treatment-outcome relationship through integral equations – and assumed that violation of this equation indicates the causal relationship. Upon this, they could establish asymptotic properties for causal hypothesis testing. However, these methods either relied on parametric assumptions or required discretizing continuous variables, which may lead to information loss. Moreover, it is unclear when this underlying integral-related assumption holds, making it difficult to justify the utility in practice. To address these problems, we first consider the scenario where only NCO is available. We propose a novel non-parametric procedure, which enjoys asymptotic properties and preserves more information. Moreover, we find that when NCO affects the outcome, the above integral-related assumption may not hold, rendering the causal relation unidentifiable. Informed by this, we further consider the scenario when the negative control exposure (NCE) is also available. In this scenario, we construct another integral restriction aided by this proxy, which can discover causation when NCO affects the outcome. We demonstrate these findings and the effectiveness of our proposals through comprehensive numerical studies.}
}
Endnote
%0 Conference Paper
%T Bivariate Causal Discovery with Proxy Variables: Integral Solving and Beyond
%A Yong Wu
%A Yanwei Fu
%A Shouyan Wang
%A Xinwei Sun
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-wu25h
%I PMLR
%P 67202--67259
%U https://proceedings.mlr.press/v267/wu25h.html
%V 267
%X Bivariate causal discovery is challenging when unmeasured confounders exist. To adjust for the bias, previous methods employed the proxy variable (i.e., negative control outcome (NCO)) to test the treatment-outcome relationship through integral equations – and assumed that violation of this equation indicates the causal relationship. Upon this, they could establish asymptotic properties for causal hypothesis testing. However, these methods either relied on parametric assumptions or required discretizing continuous variables, which may lead to information loss. Moreover, it is unclear when this underlying integral-related assumption holds, making it difficult to justify the utility in practice. To address these problems, we first consider the scenario where only NCO is available. We propose a novel non-parametric procedure, which enjoys asymptotic properties and preserves more information. Moreover, we find that when NCO affects the outcome, the above integral-related assumption may not hold, rendering the causal relation unidentifiable. Informed by this, we further consider the scenario when the negative control exposure (NCE) is also available. In this scenario, we construct another integral restriction aided by this proxy, which can discover causation when NCO affects the outcome. We demonstrate these findings and the effectiveness of our proposals through comprehensive numerical studies.
APA
Wu, Y., Fu, Y., Wang, S., & Sun, X. (2025). Bivariate Causal Discovery with Proxy Variables: Integral Solving and Beyond. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:67202-67259. Available from https://proceedings.mlr.press/v267/wu25h.html.

Related Material