Data-Driven Selection of Instrumental Variables for Additive Nonlinear, Constant Effects Models

Xichen Guo, Feng Xie, Yan Zeng, Hao Zhang, Zhi Geng
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:21163-21183, 2025.

Abstract

We consider the problem of selecting instrumental variables from observational data, a fundamental challenge in causal inference. Existing methods mostly focus on additive linear, constant effects models, limiting their applicability in complex real-world scenarios. In this paper, we tackle a more general and challenging setting: the additive nonlinear, constant effects model. We first propose a novel testable condition, termed the Cross Auxiliary-based independent Test (CAT) condition, for selecting the valid IV set. We show that this condition is both necessary and sufficient for identifying valid instrumental variable sets within such a model under milder assumptions. Building on this condition, we develop a practical algorithm for selecting the set of valid instrumental variables. Extensive experiments on synthetic data and two real-world datasets demonstrate the effectiveness and robustness of our proposed approach, highlighting its potential for broader applications in causal analysis.

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-guo25q,
  title = {Data-Driven Selection of Instrumental Variables for Additive Nonlinear, Constant Effects Models},
  author = {Guo, Xichen and Xie, Feng and Zeng, Yan and Zhang, Hao and Geng, Zhi},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages = {21163--21183},
  year = {2025},
  editor = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume = {267},
  series = {Proceedings of Machine Learning Research},
  month = {13--19 Jul},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/guo25q/guo25q.pdf},
  url = {https://proceedings.mlr.press/v267/guo25q.html},
  abstract = {We consider the problem of selecting instrumental variables from observational data, a fundamental challenge in causal inference. Existing methods mostly focus on additive linear, constant effects models, limiting their applicability in complex real-world scenarios. In this paper, we tackle a more general and challenging setting: the additive non-linear, constant effects model. We first propose a novel testable condition, termed the Cross Auxiliary-based independent Test (CAT) condition, for selecting the valid IV set. We show that this condition is both necessary and sufficient for identifying valid instrumental variable sets within such a model under milder assumptions. Building on this condition, we develop a practical algorithm for selecting the set of valid instrumental variables. Extensive experiments on both synthetic and two real-world datasets demonstrate the effectiveness and robustness of our proposed approach, highlighting its potential for broader applications in causal analysis.}
}
Endnote
%0 Conference Paper
%T Data-Driven Selection of Instrumental Variables for Additive Nonlinear, Constant Effects Models
%A Xichen Guo
%A Feng Xie
%A Yan Zeng
%A Hao Zhang
%A Zhi Geng
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-guo25q
%I PMLR
%P 21163--21183
%U https://proceedings.mlr.press/v267/guo25q.html
%V 267
%X We consider the problem of selecting instrumental variables from observational data, a fundamental challenge in causal inference. Existing methods mostly focus on additive linear, constant effects models, limiting their applicability in complex real-world scenarios. In this paper, we tackle a more general and challenging setting: the additive non-linear, constant effects model. We first propose a novel testable condition, termed the Cross Auxiliary-based independent Test (CAT) condition, for selecting the valid IV set. We show that this condition is both necessary and sufficient for identifying valid instrumental variable sets within such a model under milder assumptions. Building on this condition, we develop a practical algorithm for selecting the set of valid instrumental variables. Extensive experiments on both synthetic and two real-world datasets demonstrate the effectiveness and robustness of our proposed approach, highlighting its potential for broader applications in causal analysis.
APA
Guo, X., Xie, F., Zeng, Y., Zhang, H., & Geng, Z. (2025). Data-Driven Selection of Instrumental Variables for Additive Nonlinear, Constant Effects Models. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:21163-21183. Available from https://proceedings.mlr.press/v267/guo25q.html.
