Learning Modular Structures from Network Data and Node Variables

Elham Azizi, Edoardo Airoldi, James Galagan
Proceedings of the 31st International Conference on Machine Learning, PMLR 32(2):1440-1448, 2014.

Abstract

A standard technique for understanding underlying dependency structures among a set of variables posits a shared conditional probability distribution for the variables measured on individuals within a group. This approach is often referred to as module networks, where individuals are represented by nodes in a network, groups are termed modules, and the focus is on estimating the network structure among modules. However, estimation solely from node-specific variables can lead to spurious dependencies, and unverifiable structural assumptions are often used for regularization. Here, we propose an extended model that leverages direct observations about the network in addition to node-specific variables. By integrating complementary data types, we avoid the need for structural assumptions. We illustrate the theoretical and practical significance of the model and develop a reversible-jump MCMC procedure for learning modules and model parameters. We demonstrate the method's accuracy in predicting modular structures from synthetic data and its capability to learn regulatory modules in the Mycobacterium tuberculosis gene regulatory network.

Cite this Paper


BibTeX
@InProceedings{pmlr-v32-azizi14,
  title = {Learning Modular Structures from Network Data and Node Variables},
  author = {Elham Azizi and Edoardo Airoldi and James Galagan},
  booktitle = {Proceedings of the 31st International Conference on Machine Learning},
  pages = {1440--1448},
  year = {2014},
  editor = {Eric P. Xing and Tony Jebara},
  volume = {32},
  number = {2},
  series = {Proceedings of Machine Learning Research},
  address = {Beijing, China},
  month = {22--24 Jun},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v32/azizi14.pdf},
  url = {http://proceedings.mlr.press/v32/azizi14.html},
  abstract = {A standard technique for understanding underlying dependency structures among a set of variables posits a shared conditional probability distribution for the variables measured on individuals within a group. This approach is often referred to as module networks, where individuals are represented by nodes in a network, groups are termed modules, and the focus is on estimating the network structure among modules. However, estimation solely from node-specific variables can lead to spurious dependencies, and unverifiable structural assumptions are often used for regularization. Here, we propose an extended model that leverages direct observations about the network in addition to node-specific variables. By integrating complementary data types, we avoid the need for structural assumptions. We illustrate theoretical and practical significance of the model and develop a reversible-jump MCMC learning procedure for learning modules and model parameters. We demonstrate the method accuracy in predicting modular structures from synthetic data and capability to learn regulatory modules in the Mycobacterium tuberculosis gene regulatory network.}
}
Endnote
%0 Conference Paper
%T Learning Modular Structures from Network Data and Node Variables
%A Elham Azizi
%A Edoardo Airoldi
%A James Galagan
%B Proceedings of the 31st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2014
%E Eric P. Xing
%E Tony Jebara
%F pmlr-v32-azizi14
%I PMLR
%J Proceedings of Machine Learning Research
%P 1440--1448
%U http://proceedings.mlr.press
%V 32
%N 2
%W PMLR
%X A standard technique for understanding underlying dependency structures among a set of variables posits a shared conditional probability distribution for the variables measured on individuals within a group. This approach is often referred to as module networks, where individuals are represented by nodes in a network, groups are termed modules, and the focus is on estimating the network structure among modules. However, estimation solely from node-specific variables can lead to spurious dependencies, and unverifiable structural assumptions are often used for regularization. Here, we propose an extended model that leverages direct observations about the network in addition to node-specific variables. By integrating complementary data types, we avoid the need for structural assumptions. We illustrate theoretical and practical significance of the model and develop a reversible-jump MCMC learning procedure for learning modules and model parameters. We demonstrate the method accuracy in predicting modular structures from synthetic data and capability to learn regulatory modules in the Mycobacterium tuberculosis gene regulatory network.
RIS
TY  - CPAPER
TI  - Learning Modular Structures from Network Data and Node Variables
AU  - Elham Azizi
AU  - Edoardo Airoldi
AU  - James Galagan
BT  - Proceedings of the 31st International Conference on Machine Learning
PY  - 2014/01/27
DA  - 2014/01/27
ED  - Eric P. Xing
ED  - Tony Jebara
ID  - pmlr-v32-azizi14
PB  - PMLR
SP  - 1440
DP  - PMLR
EP  - 1448
L1  - http://proceedings.mlr.press/v32/azizi14.pdf
UR  - http://proceedings.mlr.press/v32/azizi14.html
AB  - A standard technique for understanding underlying dependency structures among a set of variables posits a shared conditional probability distribution for the variables measured on individuals within a group. This approach is often referred to as module networks, where individuals are represented by nodes in a network, groups are termed modules, and the focus is on estimating the network structure among modules. However, estimation solely from node-specific variables can lead to spurious dependencies, and unverifiable structural assumptions are often used for regularization. Here, we propose an extended model that leverages direct observations about the network in addition to node-specific variables. By integrating complementary data types, we avoid the need for structural assumptions. We illustrate theoretical and practical significance of the model and develop a reversible-jump MCMC learning procedure for learning modules and model parameters. We demonstrate the method accuracy in predicting modular structures from synthetic data and capability to learn regulatory modules in the Mycobacterium tuberculosis gene regulatory network.
ER  -
APA
Azizi, E., Airoldi, E., & Galagan, J. (2014). Learning Modular Structures from Network Data and Node Variables. Proceedings of the 31st International Conference on Machine Learning, in PMLR 32(2):1440-1448.
