Graph Neural Network Explanations are Fragile

Jiate Li, Meng Pang, Yun Dong, Jinyuan Jia, Binghui Wang
Proceedings of the 41st International Conference on Machine Learning, PMLR 235:28551-28567, 2024.

Abstract

Explainable Graph Neural Network (GNN) has emerged recently to foster the trust of using GNNs. Existing GNN explainers are developed from various perspectives to enhance the explanation performance. We take the first step to study GNN explainers under adversarial attack—We found that an adversary slightly perturbing graph structure can ensure GNN model makes correct predictions, but the GNN explainer yields a drastically different explanation on the perturbed graph. Specifically, we first formulate the attack problem under a practical threat model (i.e., the adversary has limited knowledge about the GNN explainer and a restricted perturbation budget). We then design two methods (i.e., one is loss-based and the other is deduction-based) to realize the attack. We evaluate our attacks on various GNN explainers and the results show these explainers are fragile.

Cite this Paper


BibTeX
@InProceedings{pmlr-v235-li24bd, title = {Graph Neural Network Explanations are Fragile}, author = {Li, Jiate and Pang, Meng and Dong, Yun and Jia, Jinyuan and Wang, Binghui}, booktitle = {Proceedings of the 41st International Conference on Machine Learning}, pages = {28551--28567}, year = {2024}, editor = {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix}, volume = {235}, series = {Proceedings of Machine Learning Research}, month = {21--27 Jul}, publisher = {PMLR}, pdf = {https://raw.githubusercontent.com/mlresearch/v235/main/assets/li24bd/li24bd.pdf}, url = {https://proceedings.mlr.press/v235/li24bd.html}, abstract = {Explainable Graph Neural Network (GNN) has emerged recently to foster the trust of using GNNs. Existing GNN explainers are developed from various perspectives to enhance the explanation performance. We take the first step to study GNN explainers under adversarial attack—We found that an adversary slightly perturbing graph structure can ensure GNN model makes correct predictions, but the GNN explainer yields a drastically different explanation on the perturbed graph. Specifically, we first formulate the attack problem under a practical threat model (i.e., the adversary has limited knowledge about the GNN explainer and a restricted perturbation budget). We then design two methods (i.e., one is loss-based and the other is deduction-based) to realize the attack. We evaluate our attacks on various GNN explainers and the results show these explainers are fragile.} }
Endnote
%0 Conference Paper %T Graph Neural Network Explanations are Fragile %A Jiate Li %A Meng Pang %A Yun Dong %A Jinyuan Jia %A Binghui Wang %B Proceedings of the 41st International Conference on Machine Learning %C Proceedings of Machine Learning Research %D 2024 %E Ruslan Salakhutdinov %E Zico Kolter %E Katherine Heller %E Adrian Weller %E Nuria Oliver %E Jonathan Scarlett %E Felix Berkenkamp %F pmlr-v235-li24bd %I PMLR %P 28551--28567 %U https://proceedings.mlr.press/v235/li24bd.html %V 235 %X Explainable Graph Neural Network (GNN) has emerged recently to foster the trust of using GNNs. Existing GNN explainers are developed from various perspectives to enhance the explanation performance. We take the first step to study GNN explainers under adversarial attack—We found that an adversary slightly perturbing graph structure can ensure GNN model makes correct predictions, but the GNN explainer yields a drastically different explanation on the perturbed graph. Specifically, we first formulate the attack problem under a practical threat model (i.e., the adversary has limited knowledge about the GNN explainer and a restricted perturbation budget). We then design two methods (i.e., one is loss-based and the other is deduction-based) to realize the attack. We evaluate our attacks on various GNN explainers and the results show these explainers are fragile.
APA
Li, J., Pang, M., Dong, Y., Jia, J. & Wang, B.. (2024). Graph Neural Network Explanations are Fragile. Proceedings of the 41st International Conference on Machine Learning, in Proceedings of Machine Learning Research 235:28551-28567 Available from https://proceedings.mlr.press/v235/li24bd.html.

Related Material