GEFL: Extended Filtration Learning for Graph Classification

Simon Zhang, Soham Mukherjee, Tamal K. Dey
Proceedings of the First Learning on Graphs Conference, PMLR 198:16:1-16:26, 2022.

Abstract

Extended persistence is a technique from topological data analysis to obtain global multiscale topological information from a graph. This includes information about connected components and cycles that are captured by the so-called persistence barcodes. We introduce extended persistence into a supervised learning framework for graph classification. Global topological information, in the form of a barcode with four different types of bars and their explicit cycle representatives, is combined into the model by the readout function which is computed by extended persistence. The entire model is end-to-end differentiable. We use a link-cut tree data structure and parallelism to lower the complexity of computing extended persistence, obtaining a speedup of more than 60x over the state-of-the-art for extended persistence computation. This makes extended persistence feasible for machine learning. We show that, under certain conditions, extended persistence surpasses both the WL[1] graph isomorphism test and 0-dimensional barcodes in terms of expressivity because it adds more global (topological) information. In particular, arbitrarily long cycles can be represented, which is difficult for finite receptive field message passing graph neural networks. Furthermore, we show the effectiveness of our method on real world datasets compared to many existing recent graph representation learning methods.

Cite this Paper


BibTeX
@InProceedings{pmlr-v198-zhang22b, title = {GEFL: Extended Filtration Learning for Graph Classification}, author = {Zhang, Simon and Mukherjee, Soham and Dey, Tamal K.}, booktitle = {Proceedings of the First Learning on Graphs Conference}, pages = {16:1--16:26}, year = {2022}, editor = {Rieck, Bastian and Pascanu, Razvan}, volume = {198}, series = {Proceedings of Machine Learning Research}, month = {09--12 Dec}, publisher = {PMLR}, pdf = {https://proceedings.mlr.press/v198/zhang22b/zhang22b.pdf}, url = {https://proceedings.mlr.press/v198/zhang22b.html}, abstract = {Extended persistence is a technique from topological data analysis to obtain global multiscale topological information from a graph. This includes information about connected components and cycles that are captured by the so-called persistence barcodes. We introduce extended persistence into a supervised learning framework for graph classification. Global topological information, in the form of a barcode with four different types of bars and their explicit cycle representatives, is combined into the model by the readout function which is computed by extended persistence. The entire model is end-to-end differentiable. We use a link-cut tree data structure and parallelism to lower the complexity of computing extended persistence, obtaining a speedup of more than 60x over the state-of-the-art for extended persistence computation. This makes extended persistence feasible for machine learning. We show that, under certain conditions, extended persistence surpasses both the WL[1] graph isomorphism test and 0-dimensional barcodes in terms of expressivity because it adds more global (topological) information. In particular, arbitrarily long cycles can be represented, which is difficult for finite receptive field message passing graph neural networks. Furthermore, we show the effectiveness of our method on real world datasets compared to many existing recent graph representation learning methods.} }
Endnote
%0 Conference Paper %T GEFL: Extended Filtration Learning for Graph Classification %A Simon Zhang %A Soham Mukherjee %A Tamal K. Dey %B Proceedings of the First Learning on Graphs Conference %C Proceedings of Machine Learning Research %D 2022 %E Bastian Rieck %E Razvan Pascanu %F pmlr-v198-zhang22b %I PMLR %P 16:1--16:26 %U https://proceedings.mlr.press/v198/zhang22b.html %V 198 %X Extended persistence is a technique from topological data analysis to obtain global multiscale topological information from a graph. This includes information about connected components and cycles that are captured by the so-called persistence barcodes. We introduce extended persistence into a supervised learning framework for graph classification. Global topological information, in the form of a barcode with four different types of bars and their explicit cycle representatives, is combined into the model by the readout function which is computed by extended persistence. The entire model is end-to-end differentiable. We use a link-cut tree data structure and parallelism to lower the complexity of computing extended persistence, obtaining a speedup of more than 60x over the state-of-the-art for extended persistence computation. This makes extended persistence feasible for machine learning. We show that, under certain conditions, extended persistence surpasses both the WL[1] graph isomorphism test and 0-dimensional barcodes in terms of expressivity because it adds more global (topological) information. In particular, arbitrarily long cycles can be represented, which is difficult for finite receptive field message passing graph neural networks. Furthermore, we show the effectiveness of our method on real world datasets compared to many existing recent graph representation learning methods.
APA
Zhang, S., Mukherjee, S. & Dey, T.K.. (2022). GEFL: Extended Filtration Learning for Graph Classification. Proceedings of the First Learning on Graphs Conference, in Proceedings of Machine Learning Research 198:16:1-16:26 Available from https://proceedings.mlr.press/v198/zhang22b.html.

Related Material