Leveraging Label Non-Uniformity for Node Classification in Graph Neural Networks

Feng Ji, See Hian Lee, Hanyang Meng, Kai Zhao, Jielong Yang, Wee Peng Tay
Proceedings of the 40th International Conference on Machine Learning, PMLR 202:14869-14885, 2023.

Abstract

In node classification using graph neural networks (GNNs), a typical model generates logits for the different class labels at each node, and a softmax layer outputs a label prediction based on the largest logit. We demonstrate that these logits can be used to infer hidden structural information about the graph. We introduce the key notion of label non-uniformity, derived from the Wasserstein distance between the softmax distribution of a node's logits and the uniform distribution, and show that nodes with small label non-uniformity are harder to classify correctly. We theoretically analyze how label non-uniformity varies across the graph, which yields two insights for boosting model performance: augment the training set with nodes of high non-uniformity, or drop edges to reduce the maximal cut size of the set of nodes with small non-uniformity. Both mechanisms can easily be added to a base GNN model, and experimental results demonstrate that our approach improves the performance of many benchmark base models.
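
As a concrete illustration of the abstract's key quantity: if the ground metric on the label set is the discrete (0-1) metric, the 1-Wasserstein distance between a distribution p and the uniform distribution u reduces to half the total-variation distance, 0.5 * ||p - u||_1. The PyTorch sketch below computes the per-node score under that assumption; it is a minimal reading of the definition, not the authors' reference implementation, and the paper's exact ground metric may differ.

    import torch
    import torch.nn.functional as F

    def label_non_uniformity(logits: torch.Tensor) -> torch.Tensor:
        """Per-node non-uniformity of the softmax label distribution.

        Under the discrete (0-1) ground metric, the 1-Wasserstein distance
        to the uniform distribution equals half the total variation:
            W1(p, u) = 0.5 * ||p - u||_1.
        Small scores flag nodes whose label distribution is nearly uniform,
        i.e. the nodes the paper identifies as harder to classify.
        """
        probs = F.softmax(logits, dim=-1)           # (num_nodes, num_classes)
        num_classes = probs.size(-1)
        uniform = torch.full_like(probs, 1.0 / num_classes)
        return 0.5 * (probs - uniform).abs().sum(dim=-1)

The two performance-boosting mechanisms can likewise be sketched. In the hypothetical rendering below, nodes scoring above an assumed threshold tau_hi are pseudo-labeled and added to the training pool, and edges whose endpoints both score below an assumed threshold tau_lo and disagree in predicted label are dropped, as one simple proxy for reducing the cut size of the ambiguous node set. The thresholds, the function name augment_and_prune, and the edge-selection rule are illustrative choices, and edge_index follows the PyTorch Geometric (2, num_edges) convention.

    def augment_and_prune(logits, edge_index, train_mask, tau_hi=0.4, tau_lo=0.1):
        """Hypothetical sketch of the two mechanisms in the abstract.

        1. Pseudo-label confident (high non-uniformity) nodes and add
           them to the training pool.
        2. Drop edges between ambiguous (low non-uniformity) nodes whose
           predicted labels disagree -- a simple proxy for reducing the
           maximal cut size of the ambiguous node set.
        """
        scores = label_non_uniformity(logits)
        preds = logits.argmax(dim=-1)

        # Mechanism 1: enlarge the training mask with confident nodes.
        new_train_mask = train_mask | (scores > tau_hi)

        # Mechanism 2: prune disagreeing edges inside the ambiguous set.
        src, dst = edge_index
        ambiguous = scores < tau_lo
        cut = ambiguous[src] & ambiguous[dst] & (preds[src] != preds[dst])
        pruned_edge_index = edge_index[:, ~cut]

        return new_train_mask, preds, pruned_edge_index

Both helpers wrap around any base model in the way the abstract suggests: score the current logits, update the training mask and edge set, and continue training the base GNN.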

Cite this Paper


BibTeX
@InProceedings{pmlr-v202-ji23a,
  title     = {Leveraging Label Non-Uniformity for Node Classification in Graph Neural Networks},
  author    = {Ji, Feng and Lee, See Hian and Meng, Hanyang and Zhao, Kai and Yang, Jielong and Tay, Wee Peng},
  booktitle = {Proceedings of the 40th International Conference on Machine Learning},
  pages     = {14869--14885},
  year      = {2023},
  editor    = {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan},
  volume    = {202},
  series    = {Proceedings of Machine Learning Research},
  month     = {23--29 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v202/ji23a/ji23a.pdf},
  url       = {https://proceedings.mlr.press/v202/ji23a.html},
  abstract  = {In node classification using graph neural networks (GNNs), a typical model generates logits for different class labels at each node. A softmax layer often outputs a label prediction based on the largest logit. We demonstrate that it is possible to infer hidden graph structural information from the dataset using these logits. We introduce the key notion of label non-uniformity, which is derived from the Wasserstein distance between the softmax distribution of the logits and the uniform distribution. We demonstrate that nodes with small label non-uniformity are harder to classify correctly. We theoretically analyze how the label non-uniformity varies across the graph, which provides insights into boosting the model performance: increasing training samples with high non-uniformity or dropping edges to reduce the maximal cut size of the node set of small non-uniformity. These mechanisms can be easily added to a base GNN model. Experimental results demonstrate that our approach improves the performance of many benchmark base models.}
}
Endnote
%0 Conference Paper
%T Leveraging Label Non-Uniformity for Node Classification in Graph Neural Networks
%A Feng Ji
%A See Hian Lee
%A Hanyang Meng
%A Kai Zhao
%A Jielong Yang
%A Wee Peng Tay
%B Proceedings of the 40th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Andreas Krause
%E Emma Brunskill
%E Kyunghyun Cho
%E Barbara Engelhardt
%E Sivan Sabato
%E Jonathan Scarlett
%F pmlr-v202-ji23a
%I PMLR
%P 14869--14885
%U https://proceedings.mlr.press/v202/ji23a.html
%V 202
%X In node classification using graph neural networks (GNNs), a typical model generates logits for different class labels at each node. A softmax layer often outputs a label prediction based on the largest logit. We demonstrate that it is possible to infer hidden graph structural information from the dataset using these logits. We introduce the key notion of label non-uniformity, which is derived from the Wasserstein distance between the softmax distribution of the logits and the uniform distribution. We demonstrate that nodes with small label non-uniformity are harder to classify correctly. We theoretically analyze how the label non-uniformity varies across the graph, which provides insights into boosting the model performance: increasing training samples with high non-uniformity or dropping edges to reduce the maximal cut size of the node set of small non-uniformity. These mechanisms can be easily added to a base GNN model. Experimental results demonstrate that our approach improves the performance of many benchmark base models.
APA
Ji, F., Lee, S.H., Meng, H., Zhao, K., Yang, J. & Tay, W.P. (2023). Leveraging Label Non-Uniformity for Node Classification in Graph Neural Networks. Proceedings of the 40th International Conference on Machine Learning, in Proceedings of Machine Learning Research 202:14869-14885. Available from https://proceedings.mlr.press/v202/ji23a.html.
