AutoGDA: Automated Graph Data Augmentation for Node Classification

Tong Zhao, Xianfeng Tang, Danqing Zhang, Haoming Jiang, Nikhil Rao, Yiwei Song, Pallav Agrawal, Karthik Subbian, Bing Yin, Meng Jiang
Proceedings of the First Learning on Graphs Conference, PMLR 198:32:1-32:17, 2022.

Abstract

Graph data augmentation has been used to improve the generalizability of graph machine learning. However, by applying only fixed augmentation operations to entire graphs, existing methods overlook the unique characteristics of the communities that naturally exist in graphs. For example, communities can differ in their degree distributions and homophily ratios. Ignoring such discrepancies by applying a unified augmentation strategy to the entire graph could lead to sub-optimal performance for graph data augmentation methods. In this paper, we study a novel problem of automated graph data augmentation for node classification from the localized perspective of communities. We formulate it as a bilevel optimization problem: finding a set of augmentation strategies for each community that maximizes the performance of graph neural networks on node classification. As the bilevel optimization is hard to solve directly and the search space for community-customized augmentation strategies is huge, we propose a reinforcement learning framework, AutoGDA, that learns the locally optimal augmentation strategy for each community sequentially. Our proposed approach outperforms established and popular baselines on public node classification benchmarks as well as real industry e-commerce networks by up to +12.5% accuracy.

Cite this Paper


BibTeX
@InProceedings{pmlr-v198-zhao22a,
  title     = {AutoGDA: Automated Graph Data Augmentation for Node Classification},
  author    = {Zhao, Tong and Tang, Xianfeng and Zhang, Danqing and Jiang, Haoming and Rao, Nikhil and Song, Yiwei and Agrawal, Pallav and Subbian, Karthik and Yin, Bing and Jiang, Meng},
  booktitle = {Proceedings of the First Learning on Graphs Conference},
  pages     = {32:1--32:17},
  year      = {2022},
  editor    = {Rieck, Bastian and Pascanu, Razvan},
  volume    = {198},
  series    = {Proceedings of Machine Learning Research},
  month     = {09--12 Dec},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v198/zhao22a/zhao22a.pdf},
  url       = {https://proceedings.mlr.press/v198/zhao22a.html},
  abstract  = {Graph data augmentation has been used to improve generalizability of graph machine learning. However, by only applying fixed augmentation operations on entire graphs, existing methods overlook the unique characteristics of communities which naturally exist in the graphs. For example, different communities can have various degree distributions and homophily ratios. Ignoring such discrepancy with unified augmentation strategies on the entire graph could lead to sub-optimal performance for graph data augmentation methods. In this paper, we study a novel problem of automated graph data augmentation for node classification from the localized perspective of communities. We formulate it as a bilevel optimization problem: finding a set of augmentation strategies for each community, which maximizes the performance of graph neural networks on node classification. As the bilevel optimization is hard to solve directly and the search space for community-customized augmentations strategy is huge, we propose a reinforcement learning framework AutoGDA that learns the local-optimal augmentation strategy for each community sequentially. Our proposed approach outperforms established and popular baselines on public node classification benchmarks as well as real industry e-commerce networks by up to +12.5% accuracy.}
}
Endnote
%0 Conference Paper
%T AutoGDA: Automated Graph Data Augmentation for Node Classification
%A Tong Zhao
%A Xianfeng Tang
%A Danqing Zhang
%A Haoming Jiang
%A Nikhil Rao
%A Yiwei Song
%A Pallav Agrawal
%A Karthik Subbian
%A Bing Yin
%A Meng Jiang
%B Proceedings of the First Learning on Graphs Conference
%C Proceedings of Machine Learning Research
%D 2022
%E Bastian Rieck
%E Razvan Pascanu
%F pmlr-v198-zhao22a
%I PMLR
%P 32:1--32:17
%U https://proceedings.mlr.press/v198/zhao22a.html
%V 198
%X Graph data augmentation has been used to improve generalizability of graph machine learning. However, by only applying fixed augmentation operations on entire graphs, existing methods overlook the unique characteristics of communities which naturally exist in the graphs. For example, different communities can have various degree distributions and homophily ratios. Ignoring such discrepancy with unified augmentation strategies on the entire graph could lead to sub-optimal performance for graph data augmentation methods. In this paper, we study a novel problem of automated graph data augmentation for node classification from the localized perspective of communities. We formulate it as a bilevel optimization problem: finding a set of augmentation strategies for each community, which maximizes the performance of graph neural networks on node classification. As the bilevel optimization is hard to solve directly and the search space for community-customized augmentations strategy is huge, we propose a reinforcement learning framework AutoGDA that learns the local-optimal augmentation strategy for each community sequentially. Our proposed approach outperforms established and popular baselines on public node classification benchmarks as well as real industry e-commerce networks by up to +12.5% accuracy.
APA
Zhao, T., Tang, X., Zhang, D., Jiang, H., Rao, N., Song, Y., Agrawal, P., Subbian, K., Yin, B. & Jiang, M. (2022). AutoGDA: Automated Graph Data Augmentation for Node Classification. Proceedings of the First Learning on Graphs Conference, in Proceedings of Machine Learning Research 198:32:1-32:17. Available from https://proceedings.mlr.press/v198/zhao22a.html.
