Graph Inverse Style Transfer for Counterfactual Explainability

Bardh Prenkaj, Efstratios Zaradoukas, Gjergji Kasneci
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:49803-49829, 2025.

Abstract

Counterfactual explainability seeks to uncover model decisions by identifying minimal changes to the input that alter the predicted outcome. This task becomes particularly challenging for graph data due to the need to preserve structural integrity and semantic meaning. Unlike prior approaches that rely on forward perturbation mechanisms, we introduce Graph Inverse Style Transfer (GIST), the first framework to re-imagine graph counterfactual generation as a backtracking process, leveraging spectral style transfer. By aligning the global structure with the original input spectrum and preserving local content faithfulness, GIST produces valid counterfactuals as interpolations between the input style and counterfactual content. Tested on 8 binary and multi-class graph classification benchmarks, GIST achieves a remarkable +7.6% improvement in the validity of produced counterfactuals and significant gains (+45.5%) in faithfully explaining the true class distribution. Additionally, GIST’s backtracking mechanism effectively mitigates overshooting the underlying predictor’s decision boundary, minimizing the spectral differences between the input and the counterfactuals. These results challenge traditional forward perturbation methods, offering a novel perspective that advances graph explainability.

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-prenkaj25a,
  title = {Graph Inverse Style Transfer for Counterfactual Explainability},
  author = {Prenkaj, Bardh and Zaradoukas, Efstratios and Kasneci, Gjergji},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages = {49803--49829},
  year = {2025},
  editor = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume = {267},
  series = {Proceedings of Machine Learning Research},
  month = {13--19 Jul},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/prenkaj25a/prenkaj25a.pdf},
  url = {https://proceedings.mlr.press/v267/prenkaj25a.html},
  abstract = {Counterfactual explainability seeks to uncover model decisions by identifying minimal changes to the input that alter the predicted outcome. This task becomes particularly challenging for graph data due to preserving structural integrity and semantic meaning. Unlike prior approaches that rely on forward perturbation mechanisms, we introduce Graph Inverse Style Transfer (GIST), the first framework to re-imagine graph counterfactual generation as a backtracking process, leveraging spectral style transfer. By aligning the global structure with the original input spectrum and preserving local content faithfulness, GIST produces valid counterfactuals as interpolations between the input style and counterfactual content. Tested on 8 binary and multi-class graph classification benchmarks, GIST achieves a remarkable +7.6% improvement in the validity of produced counterfactuals and significant gains (+45.5%) in faithfully explaining the true class distribution. Additionally, GIST’s backtracking mechanism effectively mitigates overshooting the underlying predictor’s decision boundary, minimizing the spectral differences between the input and the counterfactuals. These results challenge traditional forward perturbation methods, offering a novel perspective that advances graph explainability.}
}
Endnote
%0 Conference Paper
%T Graph Inverse Style Transfer for Counterfactual Explainability
%A Bardh Prenkaj
%A Efstratios Zaradoukas
%A Gjergji Kasneci
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-prenkaj25a
%I PMLR
%P 49803--49829
%U https://proceedings.mlr.press/v267/prenkaj25a.html
%V 267
%X Counterfactual explainability seeks to uncover model decisions by identifying minimal changes to the input that alter the predicted outcome. This task becomes particularly challenging for graph data due to preserving structural integrity and semantic meaning. Unlike prior approaches that rely on forward perturbation mechanisms, we introduce Graph Inverse Style Transfer (GIST), the first framework to re-imagine graph counterfactual generation as a backtracking process, leveraging spectral style transfer. By aligning the global structure with the original input spectrum and preserving local content faithfulness, GIST produces valid counterfactuals as interpolations between the input style and counterfactual content. Tested on 8 binary and multi-class graph classification benchmarks, GIST achieves a remarkable +7.6% improvement in the validity of produced counterfactuals and significant gains (+45.5%) in faithfully explaining the true class distribution. Additionally, GIST’s backtracking mechanism effectively mitigates overshooting the underlying predictor’s decision boundary, minimizing the spectral differences between the input and the counterfactuals. These results challenge traditional forward perturbation methods, offering a novel perspective that advances graph explainability.
APA
Prenkaj, B., Zaradoukas, E. & Kasneci, G. (2025). Graph Inverse Style Transfer for Counterfactual Explainability. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:49803-49829. Available from https://proceedings.mlr.press/v267/prenkaj25a.html.

Related Material