Edge Label Inference in Generalized Stochastic Block Models: from Spectral Theory to Impossibility Results

Jiaming Xu, Laurent Massoulié, Marc Lelarge
Proceedings of The 27th Conference on Learning Theory, PMLR 35:903-920, 2014.

Abstract

The classical setting of community detection consists of networks exhibiting a clustered structure. To more accurately model real systems we consider a class of networks (i) whose edges may carry labels and (ii) which may lack a clustered structure. Specifically we assume that nodes possess latent attributes drawn from a general compact space and edges between two nodes are randomly generated and labeled according to some unknown distribution as a function of their latent attributes. Our goal is then to infer the edge label distributions from a partially observed network. We propose a computationally efficient spectral algorithm and show it allows for asymptotically correct inference when the average node degree could be as low as logarithmic in the total number of nodes. Conversely, if the average node degree is below a specific constant threshold, we show that no algorithm can achieve better inference than guessing without using the observations. As a byproduct of our analysis, we show that our model provides a general procedure to construct random graph models with a spectrum asymptotic to a pre-specified eigenvalue distribution such as a power-law distribution.

Cite this Paper


BibTeX
@InProceedings{pmlr-v35-xu14, title = {Edge Label Inference in Generalized Stochastic Block Models: from Spectral Theory to Impossibility Results}, author = {Xu, Jiaming and Massoulié, Laurent and Lelarge, Marc}, booktitle = {Proceedings of The 27th Conference on Learning Theory}, pages = {903--920}, year = {2014}, editor = {Balcan, Maria Florina and Feldman, Vitaly and Szepesvári, Csaba}, volume = {35}, series = {Proceedings of Machine Learning Research}, address = {Barcelona, Spain}, month = {13--15 Jun}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v35/xu14.pdf}, url = {https://proceedings.mlr.press/v35/xu14.html}, abstract = {The classical setting of community detection consists of networks exhibiting a clustered structure. To more accurately model real systems we consider a class of networks (i) whose edges may carry labels and (ii) which may lack a clustered structure. Specifically we assume that nodes possess latent attributes drawn from a general compact space and edges between two nodes are randomly generated and labeled according to some unknown distribution as a function of their latent attributes. Our goal is then to infer the edge label distributions from a partially observed network. We propose a computationally efficient spectral algorithm and show it allows for asymptotically correct inference when the average node degree could be as low as logarithmic in the total number of nodes. Conversely, if the average node degree is below a specific constant threshold, we show that no algorithm can achieve better inference than guessing without using the observations. As a byproduct of our analysis, we show that our model provides a general procedure to construct random graph models with a spectrum asymptotic to a pre-specified eigenvalue distribution such as a power-law distribution. } }
Endnote
%0 Conference Paper %T Edge Label Inference in Generalized Stochastic Block Models: from Spectral Theory to Impossibility Results %A Jiaming Xu %A Laurent Massoulié %A Marc Lelarge %B Proceedings of The 27th Conference on Learning Theory %C Proceedings of Machine Learning Research %D 2014 %E Maria Florina Balcan %E Vitaly Feldman %E Csaba Szepesvári %F pmlr-v35-xu14 %I PMLR %P 903--920 %U https://proceedings.mlr.press/v35/xu14.html %V 35 %X The classical setting of community detection consists of networks exhibiting a clustered structure. To more accurately model real systems we consider a class of networks (i) whose edges may carry labels and (ii) which may lack a clustered structure. Specifically we assume that nodes possess latent attributes drawn from a general compact space and edges between two nodes are randomly generated and labeled according to some unknown distribution as a function of their latent attributes. Our goal is then to infer the edge label distributions from a partially observed network. We propose a computationally efficient spectral algorithm and show it allows for asymptotically correct inference when the average node degree could be as low as logarithmic in the total number of nodes. Conversely, if the average node degree is below a specific constant threshold, we show that no algorithm can achieve better inference than guessing without using the observations. As a byproduct of our analysis, we show that our model provides a general procedure to construct random graph models with a spectrum asymptotic to a pre-specified eigenvalue distribution such as a power-law distribution.
RIS
TY - CPAPER TI - Edge Label Inference in Generalized Stochastic Block Models: from Spectral Theory to Impossibility Results AU - Jiaming Xu AU - Laurent Massoulié AU - Marc Lelarge BT - Proceedings of The 27th Conference on Learning Theory DA - 2014/05/29 ED - Maria Florina Balcan ED - Vitaly Feldman ED - Csaba Szepesvári ID - pmlr-v35-xu14 PB - PMLR DP - Proceedings of Machine Learning Research VL - 35 SP - 903 EP - 920 L1 - http://proceedings.mlr.press/v35/xu14.pdf UR - https://proceedings.mlr.press/v35/xu14.html AB - The classical setting of community detection consists of networks exhibiting a clustered structure. To more accurately model real systems we consider a class of networks (i) whose edges may carry labels and (ii) which may lack a clustered structure. Specifically we assume that nodes possess latent attributes drawn from a general compact space and edges between two nodes are randomly generated and labeled according to some unknown distribution as a function of their latent attributes. Our goal is then to infer the edge label distributions from a partially observed network. We propose a computationally efficient spectral algorithm and show it allows for asymptotically correct inference when the average node degree could be as low as logarithmic in the total number of nodes. Conversely, if the average node degree is below a specific constant threshold, we show that no algorithm can achieve better inference than guessing without using the observations. As a byproduct of our analysis, we show that our model provides a general procedure to construct random graph models with a spectrum asymptotic to a pre-specified eigenvalue distribution such as a power-law distribution. ER -
APA
Xu, J., Massoulié, L. & Lelarge, M.. (2014). Edge Label Inference in Generalized Stochastic Block Models: from Spectral Theory to Impossibility Results. Proceedings of The 27th Conference on Learning Theory, in Proceedings of Machine Learning Research 35:903-920 Available from https://proceedings.mlr.press/v35/xu14.html.

Related Material