TopoImb: Toward Topology-Level Imbalance in Learning From Graphs

Tianxiang Zhao, Dongsheng Luo, Xiang Zhang, Suhang Wang
Proceedings of the First Learning on Graphs Conference, PMLR 198:37:1-37:18, 2022.

Abstract

Graphs serve as a powerful tool for modeling data that has an underlying structure in non-Euclidean space, by encoding relations as edges and entities as nodes. Despite developments in learning from graph-structured data over the years, one obstacle persists: graph imbalance. Although several attempts have been made to target this problem, they are limited to considering only class-level imbalance. In this work, we argue that for graphs, the imbalance is likely to exist at the sub-class topology group level. Due to the flexibility of topology structures, graphs could be highly diverse, and learning a generalizable classification boundary would be difficult. Therefore, several majority topology groups may dominate the learning process, rendering others under-represented. To address this problem, we propose a new framework TopoImb and design (1) a topology extractor, which automatically identifies the topology group for each instance with explicit memory cells, and (2) a training modulator, which modulates the learning process of the target GNN model to prevent the case of topology-group-wise under-representation. TopoImb can be used as a key component in GNN models to improve their performances under the data imbalance setting. Analyses on both topology-level imbalance and the proposed TopoImb are provided theoretically, and we empirically verify its effectiveness with both node-level and graph-level classification as the target tasks.

Cite this Paper


BibTeX
@InProceedings{pmlr-v198-zhao22b,
  title = {TopoImb: Toward Topology-Level Imbalance in Learning From Graphs},
  author = {Zhao, Tianxiang and Luo, Dongsheng and Zhang, Xiang and Wang, Suhang},
  booktitle = {Proceedings of the First Learning on Graphs Conference},
  pages = {37:1--37:18},
  year = {2022},
  editor = {Rieck, Bastian and Pascanu, Razvan},
  volume = {198},
  series = {Proceedings of Machine Learning Research},
  month = {09--12 Dec},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v198/zhao22b/zhao22b.pdf},
  url = {https://proceedings.mlr.press/v198/zhao22b.html},
  abstract = {Graphs serve as a powerful tool for modeling data that has an underlying structure in non-Euclidean space, by encoding relations as edges and entities as nodes. Despite developments in learning from graph-structured data over the years, one obstacle persists: graph imbalance. Although several attempts have been made to target this problem, they are limited to considering only class-level imbalance. In this work, we argue that for graphs, the imbalance is likely to exist at the sub-class topology group level. Due to the flexibility of topology structures, graphs could be highly diverse, and learning a generalizable classification boundary would be difficult. Therefore, several majority topology groups may dominate the learning process, rendering others under-represented. To address this problem, we propose a new framework TopoImb and design (1) a topology extractor, which automatically identifies the topology group for each instance with explicit memory cells, and (2) a training modulator, which modulates the learning process of the target GNN model to prevent the case of topology-group-wise under-representation. TopoImb can be used as a key component in GNN models to improve their performances under the data imbalance setting. Analyses on both topology-level imbalance and the proposed TopoImb are provided theoretically, and we empirically verify its effectiveness with both node-level and graph-level classification as the target tasks.}
}
Endnote
%0 Conference Paper
%T TopoImb: Toward Topology-Level Imbalance in Learning From Graphs
%A Tianxiang Zhao
%A Dongsheng Luo
%A Xiang Zhang
%A Suhang Wang
%B Proceedings of the First Learning on Graphs Conference
%C Proceedings of Machine Learning Research
%D 2022
%E Bastian Rieck
%E Razvan Pascanu
%F pmlr-v198-zhao22b
%I PMLR
%P 37:1--37:18
%U https://proceedings.mlr.press/v198/zhao22b.html
%V 198
%X Graphs serve as a powerful tool for modeling data that has an underlying structure in non-Euclidean space, by encoding relations as edges and entities as nodes. Despite developments in learning from graph-structured data over the years, one obstacle persists: graph imbalance. Although several attempts have been made to target this problem, they are limited to considering only class-level imbalance. In this work, we argue that for graphs, the imbalance is likely to exist at the sub-class topology group level. Due to the flexibility of topology structures, graphs could be highly diverse, and learning a generalizable classification boundary would be difficult. Therefore, several majority topology groups may dominate the learning process, rendering others under-represented. To address this problem, we propose a new framework TopoImb and design (1) a topology extractor, which automatically identifies the topology group for each instance with explicit memory cells, and (2) a training modulator, which modulates the learning process of the target GNN model to prevent the case of topology-group-wise under-representation. TopoImb can be used as a key component in GNN models to improve their performances under the data imbalance setting. Analyses on both topology-level imbalance and the proposed TopoImb are provided theoretically, and we empirically verify its effectiveness with both node-level and graph-level classification as the target tasks.
APA
Zhao, T., Luo, D., Zhang, X., & Wang, S. (2022). TopoImb: Toward Topology-Level Imbalance in Learning From Graphs. Proceedings of the First Learning on Graphs Conference, in Proceedings of Machine Learning Research 198:37:1-37:18. Available from https://proceedings.mlr.press/v198/zhao22b.html.
