Structure of Nonlinear Node Embeddings in Stochastic Block Models

Christopher Harker, Aditya Bhaskara
Proceedings of The 26th International Conference on Artificial Intelligence and Statistics, PMLR 206:6764-6782, 2023.

Abstract

Nonlinear node embedding techniques such as DeepWalk and Node2Vec are used extensively in practice to uncover structure in graphs. Despite theoretical guarantees in special regimes (such as the case of high embedding dimension), the structure of the optimal low dimensional embeddings has not been formally understood even for graphs obtained from simple generative models. We consider the stochastic block model and show that under appropriate separation conditions, the optimal embeddings can be analytically characterized. Akin to known results on eigenvector based (spectral) embeddings, we prove theoretically that solution vectors are well-clustered, up to a sublinear error.
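To make the setting concrete, the following is a minimal sketch, not the paper's construction or proof: it samples a graph from a two-block stochastic block model with networkx and computes a 2-dimensional spectral embedding, whose rows concentrate around one point per block. This is the kind of well-clustered structure the paper establishes for the nonlinear (DeepWalk/Node2Vec-style) objective under separation conditions. The parameter values and library calls below are illustrative assumptions, not taken from the paper.

import numpy as np
import networkx as nx

# Assumed (illustrative) SBM parameters: two equal blocks, denser inside than across.
n_per_block, p_in, p_out = 200, 0.10, 0.01
sizes = [n_per_block, n_per_block]
probs = [[p_in, p_out], [p_out, p_in]]
G = nx.stochastic_block_model(sizes, probs, seed=0)

# A simple spectral embedding: rows of the top-2 adjacency eigenvectors.
A = nx.to_numpy_array(G)              # nodes 0..2n-1, block 0 first, then block 1
eigvals, eigvecs = np.linalg.eigh(A)  # eigenvalues returned in ascending order
X = eigvecs[:, -2:]                   # 2-dimensional embedding, one row per node

# The rows cluster by block: compare each block's mean embedding with its
# within-block spread.
labels = np.repeat(np.arange(2), n_per_block)
for b in (0, 1):
    rows = X[labels == b]
    print(f"block {b}: mean {rows.mean(axis=0).round(3)}, std {rows.std(axis=0).round(3)}")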

Cite this Paper


BibTeX
@InProceedings{pmlr-v206-harker23a,
  title     = {Structure of Nonlinear Node Embeddings in Stochastic Block Models},
  author    = {Harker, Christopher and Bhaskara, Aditya},
  booktitle = {Proceedings of The 26th International Conference on Artificial Intelligence and Statistics},
  pages     = {6764--6782},
  year      = {2023},
  editor    = {Ruiz, Francisco and Dy, Jennifer and van de Meent, Jan-Willem},
  volume    = {206},
  series    = {Proceedings of Machine Learning Research},
  month     = {25--27 Apr},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v206/harker23a/harker23a.pdf},
  url       = {https://proceedings.mlr.press/v206/harker23a.html},
  abstract  = {Nonlinear node embedding techniques such as DeepWalk and Node2Vec are used extensively in practice to uncover structure in graphs. Despite theoretical guarantees in special regimes (such as the case of high embedding dimension), the structure of the optimal low dimensional embeddings has not been formally understood even for graphs obtained from simple generative models. We consider the stochastic block model and show that under appropriate separation conditions, the optimal embeddings can be analytically characterized. Akin to known results on eigenvector based (spectral) embeddings, we prove theoretically that solution vectors are well-clustered, up to a sublinear error.}
}
Endnote
%0 Conference Paper
%T Structure of Nonlinear Node Embeddings in Stochastic Block Models
%A Christopher Harker
%A Aditya Bhaskara
%B Proceedings of The 26th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2023
%E Francisco Ruiz
%E Jennifer Dy
%E Jan-Willem van de Meent
%F pmlr-v206-harker23a
%I PMLR
%P 6764--6782
%U https://proceedings.mlr.press/v206/harker23a.html
%V 206
%X Nonlinear node embedding techniques such as DeepWalk and Node2Vec are used extensively in practice to uncover structure in graphs. Despite theoretical guarantees in special regimes (such as the case of high embedding dimension), the structure of the optimal low dimensional embeddings has not been formally understood even for graphs obtained from simple generative models. We consider the stochastic block model and show that under appropriate separation conditions, the optimal embeddings can be analytically characterized. Akin to known results on eigenvector based (spectral) embeddings, we prove theoretically that solution vectors are well-clustered, up to a sublinear error.
APA
Harker, C. & Bhaskara, A. (2023). Structure of Nonlinear Node Embeddings in Stochastic Block Models. Proceedings of The 26th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 206:6764-6782. Available from https://proceedings.mlr.press/v206/harker23a.html.
