DeepWalking Backwards: From Embeddings Back to Graphs
Proceedings of the 38th International Conference on Machine Learning, PMLR 139:1473-1483, 2021.
Abstract
Low-dimensional node embeddings play a key role in analyzing graph datasets. However, little work studies exactly what information is encoded by popular embedding methods, and how this information correlates with performance in downstream learning tasks. We tackle this question by studying whether embeddings can be inverted to (approximately) recover the graph used to generate them. Focusing on a variant of the popular DeepWalk method (Perozzi et al., 2014; Qiu et al., 2018), we present algorithms for accurate embedding inversion: from the low-dimensional embedding of a graph $G$, we can find a graph $\tilde G$ with a very similar embedding. We perform numerous experiments on real-world networks, observing that significant information about $G$, such as specific edges and bulk properties like triangle density, is often lost in $\tilde G$. However, community structure is often preserved or even enhanced. Our findings are a step towards a more rigorous understanding of exactly what information embeddings encode about the input graph, and why this information is useful for learning tasks.
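The abstract does not spell out the inversion algorithm itself. As a rough illustration of the idea, the sketch below inverts a NetMF-style DeepWalk matrix (the PMI-like matrix of Qiu et al., 2018, whose low-rank factorization gives the embedding) by gradient descent over a relaxed, weighted adjacency matrix. The function names (`netmf_matrix`, `invert_embedding`), the use of PyTorch, and all hyperparameters are assumptions for illustration only, not the authors' method.

```python
# Minimal sketch, assuming a NetMF-style embedding: find a graph whose
# DeepWalk-style matrix approximates a target M. Not the paper's algorithm.
import torch

def netmf_matrix(A, window=10, neg=1.0):
    """PMI-like matrix whose rank-d factorization gives the embedding."""
    deg = A.sum(dim=1)
    vol = deg.sum()
    P = A / deg.clamp(min=1e-9).unsqueeze(1)       # random-walk transitions D^{-1}A
    S = torch.zeros_like(A)
    Pr = torch.eye(A.shape[0])
    for _ in range(window):                        # sum of powers P^1 ... P^T
        Pr = Pr @ P
        S = S + Pr
    M = (vol / (window * neg)) * S / deg.clamp(min=1e-9).unsqueeze(0)
    return torch.log(torch.clamp(M, min=1.0))      # truncated log, as in NetMF

def invert_embedding(M_target, n, steps=500, lr=0.1):
    """Gradient descent over edge-weight logits to match M_target."""
    theta = torch.zeros(n, n, requires_grad=True)
    opt = torch.optim.Adam([theta], lr=lr)
    for _ in range(steps):
        W = torch.sigmoid((theta + theta.T) / 2)   # symmetric weights in (0, 1)
        W = W * (1 - torch.eye(n))                 # no self-loops
        loss = torch.norm(netmf_matrix(W) - M_target) ** 2
        opt.zero_grad(); loss.backward(); opt.step()
    W = torch.sigmoid((theta + theta.T) / 2) * (1 - torch.eye(n))
    return (W > 0.5).float()                       # round weights to edges of G-tilde
```

Under these assumptions, given a rank-$d$ embedding with factors $U$ and $V$ (so that $M \approx UV^\top$), one would pass the reconstruction `torch.tensor(U @ V.T)` as `M_target` and compare properties of the returned graph $\tilde G$ (edges, triangle density, communities) against those of $G$.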