Inference of nonlinear causal effects with application to TWAS with GWAS summary data

Ben Dai; Chunlin Li; Haoran Xue; Wei Pan; Xiaotong Shen

Proceedings of the Third Conference on Causal Learning and Reasoning, PMLR 236:793-826, 2024.

Abstract

Large-scale genome-wide association studies (GWAS) have offered an exciting opportunity to discover putative causal genes or risk factors associated with diseases by using SNPs as instrumental variables (IVs). However, conventional approaches assume linear causal relations partly for simplicity and partly for the availability of GWAS summary data. In this work, we propose a novel model for transcriptome-wide association studies (TWAS) to incorporate nonlinear relationships across IVs, an exposure/gene, and an outcome, which is robust against violations of the valid IV assumptions, permits the use of GWAS summary data, and covers two-stage least squares (2SLS) as a special case. We decouple the estimation of a marginal causal effect and a nonlinear transformation, where the former is estimated via sliced inverse regression and a sparse instrumental variable regression, and the latter is estimated by a ratio-adjusted inverse regression. On this ground, we propose an inferential procedure. An application of the proposed method to the ADNI gene expression data and the IGAP GWAS summary data identifies 18 causal genes associated with Alzheimer’s disease, including APOE and TOMM40, in addition to 7 other genes missed by 2SLS considering only linear relationships. Our findings suggest that nonlinear modeling is required to unleash the power of IV regression for identifying potentially nonlinear gene-trait associations. The source code and accompanying software *nl-causal* can be accessed through the link: [https://github.com/statmlben/nonlinear-causal](https://github.com/statmlben/nonlinear-causal).
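As a point of reference for the linear baseline the abstract mentions, the following is a minimal sketch of ordinary two-stage least squares (2SLS) on simulated data. It is not the nonlinear method proposed in the paper and does not use the *nl-causal* API; the function names, the simulated genotype-like instruments, and all parameter values are assumptions chosen only for illustration.

```python
import numpy as np


def two_stage_least_squares(Z, x, y):
    """Linear 2SLS with a single exposure x, instruments Z, and outcome y."""
    # Center everything so that no intercept term is needed.
    Z = Z - Z.mean(axis=0)
    x = x - x.mean()
    y = y - y.mean()
    # Stage 1: regress the exposure on the instruments.
    theta, *_ = np.linalg.lstsq(Z, x, rcond=None)
    x_hat = Z @ theta
    # Stage 2: regress the outcome on the fitted exposure.
    return (x_hat @ y) / (x_hat @ x_hat)


def simulate(n=5000, p=5, beta=0.5, seed=0):
    """Hypothetical data: SNP-like instruments and an unobserved confounder u."""
    rng = np.random.default_rng(seed)
    Z = rng.binomial(2, 0.3, size=(n, p)).astype(float)  # allele counts 0/1/2
    u = rng.normal(size=n)                               # confounds x and y
    x = Z @ np.full(p, 0.5) + u + rng.normal(size=n)     # exposure (e.g. expression)
    y = beta * x + u + rng.normal(size=n)                # outcome, true effect = beta
    return Z, x, y


Z, x, y = simulate()
beta_2sls = two_stage_least_squares(Z, x, y)

# Naive OLS of y on x for comparison; it is biased by the confounder u.
xc, yc = x - x.mean(), y - y.mean()
beta_ols = (xc @ yc) / (xc @ xc)

print(f"2SLS estimate: {beta_2sls:.3f}")
print(f"OLS estimate:  {beta_ols:.3f}")
```

In this simulated setting the instruments affect the outcome only through the exposure, so 2SLS should recover a value close to the true effect of 0.5 while the naive regression is pulled upward by the confounder. The sketch assumes valid instruments and a linear exposure-outcome relation, which are exactly the assumptions the abstract says the proposed model relaxes.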

Cite this Paper

BibTeX


@InProceedings{pmlr-v236-dai24a,
  title = {Inference of nonlinear causal effects with application to TWAS with GWAS summary data},
  author = {Dai, Ben and Li, Chunlin and Xue, Haoran and Pan, Wei and Shen, Xiaotong},
  booktitle = {Proceedings of the Third Conference on Causal Learning and Reasoning},
  pages = {793--826},
  year = {2024},
  editor = {Locatello, Francesco and Didelez, Vanessa},
  volume = {236},
  series = {Proceedings of Machine Learning Research},
  month = {01--03 Apr},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v236/dai24a/dai24a.pdf},
  url = {https://proceedings.mlr.press/v236/dai24a.html},
  abstract = {Large-scale genome-wide association studies (GWAS) have offered an exciting opportunity to discover putative causal genes or risk factors associated with diseases by using SNPs as instrumental variables (IVs). However, conventional approaches assume linear causal relations partly for simplicity and partly for the availability of GWAS summary data. In this work, we propose a novel model for transcriptome-wide association studies (TWAS) to incorporate nonlinear relationships across IVs, an exposure/gene, and an outcome, which is robust against violations of the valid IV assumptions, permits the use of GWAS summary data, and covers two-stage least squares (2SLS) as a special case. We decouple the estimation of a marginal causal effect and a nonlinear transformation, where the former is estimated via sliced inverse regression and a sparse instrumental variable regression, and the latter is estimated by a ratio-adjusted inverse regression. On this ground, we propose an inferential procedure. An application of the proposed method to the ADNI gene expression data and the IGAP GWAS summary data identifies 18 causal genes associated with Alzheimer’s disease, including APOE and TOMM40, in addition to 7 other genes missed by 2SLS considering only linear relationships. Our findings suggest that nonlinear modeling is required to unleash the power of IV regression for identifying potentially nonlinear gene-trait associations. The source code and accompanying software \emph{nl-causal} can be accessed through the link: https://github.com/statmlben/nonlinear-causal.}
}

Endnote

%0 Conference Paper
%T Inference of nonlinear causal effects with application to TWAS with GWAS summary data
%A Ben Dai
%A Chunlin Li
%A Haoran Xue
%A Wei Pan
%A Xiaotong Shen
%B Proceedings of the Third Conference on Causal Learning and Reasoning
%C Proceedings of Machine Learning Research
%D 2024
%E Francesco Locatello
%E Vanessa Didelez
%F pmlr-v236-dai24a
%I PMLR
%P 793--826
%U https://proceedings.mlr.press/v236/dai24a.html
%V 236
%X Large-scale genome-wide association studies (GWAS) have offered an exciting opportunity to discover putative causal genes or risk factors associated with diseases by using SNPs as instrumental variables (IVs). However, conventional approaches assume linear causal relations partly for simplicity and partly for the availability of GWAS summary data. In this work, we propose a novel model for transcriptome-wide association studies (TWAS) to incorporate nonlinear relationships across IVs, an exposure/gene, and an outcome, which is robust against violations of the valid IV assumptions, permits the use of GWAS summary data, and covers two-stage least squares (2SLS) as a special case. We decouple the estimation of a marginal causal effect and a nonlinear transformation, where the former is estimated via sliced inverse regression and a sparse instrumental variable regression, and the latter is estimated by a ratio-adjusted inverse regression. On this ground, we propose an inferential procedure. An application of the proposed method to the ADNI gene expression data and the IGAP GWAS summary data identifies 18 causal genes associated with Alzheimer’s disease, including APOE and TOMM40, in addition to 7 other genes missed by 2SLS considering only linear relationships. Our findings suggest that nonlinear modeling is required to unleash the power of IV regression for identifying potentially nonlinear gene-trait associations. The source code and accompanying software nl-causal can be accessed through the link: https://github.com/statmlben/nonlinear-causal.

APA


Dai, B., Li, C., Xue, H., Pan, W. & Shen, X. (2024). Inference of nonlinear causal effects with application to TWAS with GWAS summary data. Proceedings of the Third Conference on Causal Learning and Reasoning, in Proceedings of Machine Learning Research 236:793-826. Available from https://proceedings.mlr.press/v236/dai24a.html.
