Statistically significant subgraphs for genome-wide association study

Jun Sese, Aika Terada, Yuki Saito, Koji Tsuda
Proceedings of the Workshop on Statistically Sound Data Mining at ECML/PKDD, PMLR 47:29-36, 2015.

Abstract

Genome-wide association studies (GWAS) have been widely used for understanding the associations of single-nucleotide polymorphisms (SNPs) with a disease. GWAS data are often combined with known biological networks, and they have been analyzed using graph-mining techniques toward a systems understanding of the biological changes caused by the SNPs. To determine which subgraphs are associated with the disease, a statistical test on each subgraph needs to be conducted. However, no statistically significant results were found because multiple testing correction causes an extremely small corrected significance level. We introduce a method called gLAMP to enumerate subgraphs having statistically significant associations with a diagnosis. gLAMP integrates the Limitless Arity Multiple-testing Procedure (LAMP) with a graph-mining algorithm called COmmon Itemset Network mining (COIN). LAMP gives us the smallest possible Bonferroni factor, and COIN provides us with efficient enumeration of testable subgraphs. Theoretical results of their combination show the potential to enumerate subgraphs statistically significantly associated with a disease.

Cite this Paper


BibTeX
@InProceedings{pmlr-v47-sese14a,
  title     = {Statistically significant subgraphs for genome-wide association study},
  author    = {Sese, Jun and Terada, Aika and Saito, Yuki and Tsuda, Koji},
  booktitle = {Proceedings of the Workshop on Statistically Sound Data Mining at ECML/PKDD},
  pages     = {29--36},
  year      = {2015},
  editor    = {Hämäläinen, Wilhelmiina and Petitjean, François and Webb, I.},
  volume    = {47},
  series    = {Proceedings of Machine Learning Research},
  address   = {Nancy, France},
  month     = {15 Sep},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v47/sese14a.pdf},
  url       = {https://proceedings.mlr.press/v47/sese14a.html},
  abstract  = {Genome-wide association studies (GWAS) have been widely used for understanding the associations of single-nucleotide polymorphisms (SNPs) with a disease. GWAS data are often combined with known biological networks, and they have been analyzed using graph-mining techniques toward a systems understanding of the biological changes caused by the SNPs. To determine which subgraphs are associated with the disease, a statistical test on each subgraph needs to be conducted. However, no statistically significant results were found because multiple testing correction causes an extremely small corrected significance level. We introduce a method called gLAMP to enumerate subgraphs having statistically significant associations with a diagnosis. gLAMP integrates the Limitless Arity Multiple-testing Procedure (LAMP) with a graph-mining algorithm called COmmon Itemset Network mining (COIN). LAMP gives us the smallest possible Bonferroni factor, and COIN provides us with efficient enumeration of testable subgraphs. Theoretical results of their combination show the potential to enumerate subgraphs statistically significantly associated with a disease.}
}
Endnote
%0 Conference Paper
%T Statistically significant subgraphs for genome-wide association study
%A Jun Sese
%A Aika Terada
%A Yuki Saito
%A Koji Tsuda
%B Proceedings of the Workshop on Statistically Sound Data Mining at ECML/PKDD
%C Proceedings of Machine Learning Research
%D 2015
%E Wilhelmiina Hämäläinen
%E François Petitjean
%E I. Webb
%F pmlr-v47-sese14a
%I PMLR
%P 29--36
%U https://proceedings.mlr.press/v47/sese14a.html
%V 47
%X Genome-wide association studies (GWAS) have been widely used for understanding the associations of single-nucleotide polymorphisms (SNPs) with a disease. GWAS data are often combined with known biological networks, and they have been analyzed using graph-mining techniques toward a systems understanding of the biological changes caused by the SNPs. To determine which subgraphs are associated with the disease, a statistical test on each subgraph needs to be conducted. However, no statistically significant results were found because multiple testing correction causes an extremely small corrected significance level. We introduce a method called gLAMP to enumerate subgraphs having statistically significant associations with a diagnosis. gLAMP integrates the Limitless Arity Multiple-testing Procedure (LAMP) with a graph-mining algorithm called COmmon Itemset Network mining (COIN). LAMP gives us the smallest possible Bonferroni factor, and COIN provides us with efficient enumeration of testable subgraphs. Theoretical results of their combination show the potential to enumerate subgraphs statistically significantly associated with a disease.
RIS
TY  - CPAPER
TI  - Statistically significant subgraphs for genome-wide association study
AU  - Jun Sese
AU  - Aika Terada
AU  - Yuki Saito
AU  - Koji Tsuda
BT  - Proceedings of the Workshop on Statistically Sound Data Mining at ECML/PKDD
DA  - 2015/11/27
ED  - Wilhelmiina Hämäläinen
ED  - François Petitjean
ED  - I. Webb
ID  - pmlr-v47-sese14a
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 47
SP  - 29
EP  - 36
L1  - http://proceedings.mlr.press/v47/sese14a.pdf
UR  - https://proceedings.mlr.press/v47/sese14a.html
AB  - Genome-wide association studies (GWAS) have been widely used for understanding the associations of single-nucleotide polymorphisms (SNPs) with a disease. GWAS data are often combined with known biological networks, and they have been analyzed using graph-mining techniques toward a systems understanding of the biological changes caused by the SNPs. To determine which subgraphs are associated with the disease, a statistical test on each subgraph needs to be conducted. However, no statistically significant results were found because multiple testing correction causes an extremely small corrected significance level. We introduce a method called gLAMP to enumerate subgraphs having statistically significant associations with a diagnosis. gLAMP integrates the Limitless Arity Multiple-testing Procedure (LAMP) with a graph-mining algorithm called COmmon Itemset Network mining (COIN). LAMP gives us the smallest possible Bonferroni factor, and COIN provides us with efficient enumeration of testable subgraphs. Theoretical results of their combination show the potential to enumerate subgraphs statistically significantly associated with a disease.
ER  -
APA
Sese, J., Terada, A., Saito, Y. & Tsuda, K. (2015). Statistically significant subgraphs for genome-wide association study. Proceedings of the Workshop on Statistically Sound Data Mining at ECML/PKDD, in Proceedings of Machine Learning Research 47:29-36. Available from https://proceedings.mlr.press/v47/sese14a.html.
