Mitigating Label Noise on Graphs via Topological Sample Selection

Yuhao Wu, Jiangchao Yao, Xiaobo Xia, Jun Yu, Ruxin Wang, Bo Han, Tongliang Liu
Proceedings of the 41st International Conference on Machine Learning, PMLR 235:53944-53972, 2024.

Abstract

Despite the success of the carefully-annotated benchmarks, the effectiveness of existing graph neural networks (GNNs) can be considerably impaired in practice when the real-world graph data is noisily labeled. Previous explorations in sample selection have been demonstrated as an effective way for robust learning with noisy labels, however, the conventional studies focus on i.i.d data, and when moving to non-iid graph data and GNNs, two notable challenges remain: (1) nodes located near topological class boundaries are very informative for classification but cannot be successfully distinguished by the heuristic sample selection. (2) there is no available measure that considers the graph topological information to promote sample selection in a graph. To address this dilemma, we propose a $\textit{Topological Sample Selection}$ (TSS) method that boosts the informative sample selection process in a graph by utilising topological information. We theoretically prove that our procedure minimizes an upper bound of the expected risk under target clean distribution, and experimentally show the superiority of our method compared with state-of-the-art baselines.

Cite this Paper


BibTeX
@InProceedings{pmlr-v235-wu24ae, title = {Mitigating Label Noise on Graphs via Topological Sample Selection}, author = {Wu, Yuhao and Yao, Jiangchao and Xia, Xiaobo and Yu, Jun and Wang, Ruxin and Han, Bo and Liu, Tongliang}, booktitle = {Proceedings of the 41st International Conference on Machine Learning}, pages = {53944--53972}, year = {2024}, editor = {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix}, volume = {235}, series = {Proceedings of Machine Learning Research}, month = {21--27 Jul}, publisher = {PMLR}, pdf = {https://raw.githubusercontent.com/mlresearch/v235/main/assets/wu24ae/wu24ae.pdf}, url = {https://proceedings.mlr.press/v235/wu24ae.html}, abstract = {Despite the success of the carefully-annotated benchmarks, the effectiveness of existing graph neural networks (GNNs) can be considerably impaired in practice when the real-world graph data is noisily labeled. Previous explorations in sample selection have been demonstrated as an effective way for robust learning with noisy labels, however, the conventional studies focus on i.i.d data, and when moving to non-iid graph data and GNNs, two notable challenges remain: (1) nodes located near topological class boundaries are very informative for classification but cannot be successfully distinguished by the heuristic sample selection. (2) there is no available measure that considers the graph topological information to promote sample selection in a graph. To address this dilemma, we propose a $\textit{Topological Sample Selection}$ (TSS) method that boosts the informative sample selection process in a graph by utilising topological information. We theoretically prove that our procedure minimizes an upper bound of the expected risk under target clean distribution, and experimentally show the superiority of our method compared with state-of-the-art baselines.} }
Endnote
%0 Conference Paper %T Mitigating Label Noise on Graphs via Topological Sample Selection %A Yuhao Wu %A Jiangchao Yao %A Xiaobo Xia %A Jun Yu %A Ruxin Wang %A Bo Han %A Tongliang Liu %B Proceedings of the 41st International Conference on Machine Learning %C Proceedings of Machine Learning Research %D 2024 %E Ruslan Salakhutdinov %E Zico Kolter %E Katherine Heller %E Adrian Weller %E Nuria Oliver %E Jonathan Scarlett %E Felix Berkenkamp %F pmlr-v235-wu24ae %I PMLR %P 53944--53972 %U https://proceedings.mlr.press/v235/wu24ae.html %V 235 %X Despite the success of the carefully-annotated benchmarks, the effectiveness of existing graph neural networks (GNNs) can be considerably impaired in practice when the real-world graph data is noisily labeled. Previous explorations in sample selection have been demonstrated as an effective way for robust learning with noisy labels, however, the conventional studies focus on i.i.d data, and when moving to non-iid graph data and GNNs, two notable challenges remain: (1) nodes located near topological class boundaries are very informative for classification but cannot be successfully distinguished by the heuristic sample selection. (2) there is no available measure that considers the graph topological information to promote sample selection in a graph. To address this dilemma, we propose a $\textit{Topological Sample Selection}$ (TSS) method that boosts the informative sample selection process in a graph by utilising topological information. We theoretically prove that our procedure minimizes an upper bound of the expected risk under target clean distribution, and experimentally show the superiority of our method compared with state-of-the-art baselines.
APA
Wu, Y., Yao, J., Xia, X., Yu, J., Wang, R., Han, B. & Liu, T.. (2024). Mitigating Label Noise on Graphs via Topological Sample Selection. Proceedings of the 41st International Conference on Machine Learning, in Proceedings of Machine Learning Research 235:53944-53972 Available from https://proceedings.mlr.press/v235/wu24ae.html.

Related Material