Manifold Integrated Gradients: Riemannian Geometry for Feature Attribution

Eslam Zaher, Maciej Trzaskowski, Quan Nguyen, Fred Roosta
Proceedings of the 41st International Conference on Machine Learning, PMLR 235:58090-58104, 2024.

Abstract

In this paper, we dive into the reliability concerns of Integrated Gradients (IG), a prevalent feature attribution method for black-box deep learning models. We particularly address two predominant challenges associated with IG: the generation of noisy feature visualizations for vision models and the vulnerability to adversarial attributional attacks. Our approach involves an adaptation of path-based feature attribution, aligning the path of attribution more closely to the intrinsic geometry of the data manifold. Our experiments utilise deep generative models applied to several real-world image datasets. They demonstrate that IG along the geodesics conforms to the curved geometry of the Riemannian data manifold, generating more perceptually intuitive explanations and, subsequently, substantially increasing robustness to targeted attributional attacks.

Cite this Paper


BibTeX
@InProceedings{pmlr-v235-zaher24a,
  title = {Manifold Integrated Gradients: {R}iemannian Geometry for Feature Attribution},
  author = {Zaher, Eslam and Trzaskowski, Maciej and Nguyen, Quan and Roosta, Fred},
  booktitle = {Proceedings of the 41st International Conference on Machine Learning},
  pages = {58090--58104},
  year = {2024},
  editor = {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix},
  volume = {235},
  series = {Proceedings of Machine Learning Research},
  month = {21--27 Jul},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v235/main/assets/zaher24a/zaher24a.pdf},
  url = {https://proceedings.mlr.press/v235/zaher24a.html},
  abstract = {In this paper, we dive into the reliability concerns of Integrated Gradients (IG), a prevalent feature attribution method for black-box deep learning models. We particularly address two predominant challenges associated with IG: the generation of noisy feature visualizations for vision models and the vulnerability to adversarial attributional attacks. Our approach involves an adaptation of path-based feature attribution, aligning the path of attribution more closely to the intrinsic geometry of the data manifold. Our experiments utilise deep generative models applied to several real-world image datasets. They demonstrate that IG along the geodesics conforms to the curved geometry of the Riemannian data manifold, generating more perceptually intuitive explanations and, subsequently, substantially increasing robustness to targeted attributional attacks.}
}
Endnote
%0 Conference Paper
%T Manifold Integrated Gradients: Riemannian Geometry for Feature Attribution
%A Eslam Zaher
%A Maciej Trzaskowski
%A Quan Nguyen
%A Fred Roosta
%B Proceedings of the 41st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Ruslan Salakhutdinov
%E Zico Kolter
%E Katherine Heller
%E Adrian Weller
%E Nuria Oliver
%E Jonathan Scarlett
%E Felix Berkenkamp
%F pmlr-v235-zaher24a
%I PMLR
%P 58090--58104
%U https://proceedings.mlr.press/v235/zaher24a.html
%V 235
%X In this paper, we dive into the reliability concerns of Integrated Gradients (IG), a prevalent feature attribution method for black-box deep learning models. We particularly address two predominant challenges associated with IG: the generation of noisy feature visualizations for vision models and the vulnerability to adversarial attributional attacks. Our approach involves an adaptation of path-based feature attribution, aligning the path of attribution more closely to the intrinsic geometry of the data manifold. Our experiments utilise deep generative models applied to several real-world image datasets. They demonstrate that IG along the geodesics conforms to the curved geometry of the Riemannian data manifold, generating more perceptually intuitive explanations and, subsequently, substantially increasing robustness to targeted attributional attacks.
APA
Zaher, E., Trzaskowski, M., Nguyen, Q., & Roosta, F. (2024). Manifold Integrated Gradients: Riemannian Geometry for Feature Attribution. Proceedings of the 41st International Conference on Machine Learning, in Proceedings of Machine Learning Research 235:58090-58104. Available from https://proceedings.mlr.press/v235/zaher24a.html.
