A Graph is Worth $K$ Words: Euclideanizing Graph using Pure Transformer

Zhangyang Gao; Daize Dong; Cheng Tan; Jun Xia; Bozhen Hu; Stan Z. Li

A Graph is Worth $K$ Words: Euclideanizing Graph using Pure Transformer

Zhangyang Gao, Daize Dong, Cheng Tan, Jun Xia, Bozhen Hu, Stan Z. Li

Proceedings of the 41st International Conference on Machine Learning, PMLR 235:14681-14701, 2024.

Abstract

Can we model Non-Euclidean graphs as pure language or even Euclidean vectors while retaining their inherent information? The Non-Euclidean property have posed a long term challenge in graph modeling. Despite recent graph neural networks and graph transformers efforts encoding graphs as Euclidean vectors, recovering the original graph from vectors remains a challenge. In this paper, we introduce GraphsGPT, featuring an Graph2Seq encoder that transforms Non-Euclidean graphs into learnable Graph Words in the Euclidean space, along with a GraphGPT decoder that reconstructs the original graph from Graph Words to ensure information equivalence. We pretrain GraphsGPT on

$100$ M molecules and yield some interesting findings: (1) The pretrained Graph2Seq excels in graph representation learning, achieving state-of-the-art results on

$8/9$ graph classification and regression tasks. (2) The pretrained GraphGPT serves as a strong graph generator, demonstrated by its strong ability to perform both few-shot and conditional graph generation. (3) Graph2Seq+GraphGPT enables effective graph mixup in the Euclidean space, overcoming previously known Non-Euclidean challenges. (4) The edge-centric pretraining framework GraphsGPT demonstrates its efficacy in graph domain tasks, excelling in both representation and generation. Code is available at https://github.com/A4Bio/GraphsGPT.

Cite this Paper

BibTeX


@InProceedings{pmlr-v235-gao24e,
  title = 	 {A Graph is Worth $K$ Words: Euclideanizing Graph using Pure Transformer},
  author =       {Gao, Zhangyang and Dong, Daize and Tan, Cheng and Xia, Jun and Hu, Bozhen and Li, Stan Z.},
  booktitle = 	 {Proceedings of the 41st International Conference on Machine Learning},
  pages = 	 {14681--14701},
  year = 	 {2024},
  editor = 	 {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix},
  volume = 	 {235},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {21--27 Jul},
  publisher =    {PMLR},
  pdf = 	 {https://raw.githubusercontent.com/mlresearch/v235/main/assets/gao24e/gao24e.pdf},
  url = 	 {https://proceedings.mlr.press/v235/gao24e.html},
  abstract = 	 {Can we model Non-Euclidean graphs as pure language or even Euclidean vectors while retaining their inherent information? The Non-Euclidean property have posed a long term challenge in graph modeling. Despite recent graph neural networks and graph transformers efforts encoding graphs as Euclidean vectors, recovering the original graph from vectors remains a challenge. In this paper, we introduce GraphsGPT, featuring an Graph2Seq encoder that transforms Non-Euclidean graphs into learnable Graph Words in the Euclidean space, along with a GraphGPT decoder that reconstructs the original graph from Graph Words to ensure information equivalence. We pretrain GraphsGPT on $100$M molecules and yield some interesting findings: (1) The pretrained Graph2Seq excels in graph representation learning, achieving state-of-the-art results on $8/9$ graph classification and regression tasks. (2) The pretrained GraphGPT serves as a strong graph generator, demonstrated by its strong ability to perform both few-shot and conditional graph generation. (3) Graph2Seq+GraphGPT enables effective graph mixup in the Euclidean space, overcoming previously known Non-Euclidean challenges. (4) The edge-centric pretraining framework GraphsGPT demonstrates its efficacy in graph domain tasks, excelling in both representation and generation. Code is available at https://github.com/A4Bio/GraphsGPT.}
}

Endnote

%0 Conference Paper
%T A Graph is Worth $K$ Words: Euclideanizing Graph using Pure Transformer
%A Zhangyang Gao
%A Daize Dong
%A Cheng Tan
%A Jun Xia
%A Bozhen Hu
%A Stan Z. Li
%B Proceedings of the 41st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Ruslan Salakhutdinov
%E Zico Kolter
%E Katherine Heller
%E Adrian Weller
%E Nuria Oliver
%E Jonathan Scarlett
%E Felix Berkenkamp	
%F pmlr-v235-gao24e
%I PMLR
%P 14681--14701
%U https://proceedings.mlr.press/v235/gao24e.html
%V 235
%X Can we model Non-Euclidean graphs as pure language or even Euclidean vectors while retaining their inherent information? The Non-Euclidean property have posed a long term challenge in graph modeling. Despite recent graph neural networks and graph transformers efforts encoding graphs as Euclidean vectors, recovering the original graph from vectors remains a challenge. In this paper, we introduce GraphsGPT, featuring an Graph2Seq encoder that transforms Non-Euclidean graphs into learnable Graph Words in the Euclidean space, along with a GraphGPT decoder that reconstructs the original graph from Graph Words to ensure information equivalence. We pretrain GraphsGPT on $100$M molecules and yield some interesting findings: (1) The pretrained Graph2Seq excels in graph representation learning, achieving state-of-the-art results on $8/9$ graph classification and regression tasks. (2) The pretrained GraphGPT serves as a strong graph generator, demonstrated by its strong ability to perform both few-shot and conditional graph generation. (3) Graph2Seq+GraphGPT enables effective graph mixup in the Euclidean space, overcoming previously known Non-Euclidean challenges. (4) The edge-centric pretraining framework GraphsGPT demonstrates its efficacy in graph domain tasks, excelling in both representation and generation. Code is available at https://github.com/A4Bio/GraphsGPT.

APA


Gao, Z., Dong, D., Tan, C., Xia, J., Hu, B. & Li, S.Z.. (2024). A Graph is Worth $K$ Words: Euclideanizing Graph using Pure Transformer. Proceedings of the 41st International Conference on Machine Learning, in Proceedings of Machine Learning Research 235:14681-14701 Available from https://proceedings.mlr.press/v235/gao24e.html.

A Graph is Worth KK Words: Euclideanizing Graph using Pure Transformer

Abstract

Cite this Paper

Related Material

A Graph is Worth $K$ Words: Euclideanizing Graph using Pure Transformer