Position: Do Not Explain Vision Models Without Context

Paulina Tomaszewska, Przemyslaw Biecek
Proceedings of the 41st International Conference on Machine Learning, PMLR 235:48390-48403, 2024.

Abstract

Does the stethoscope in the picture make the adjacent person a doctor or a patient? This, of course, depends on the contextual relationship of the two objects. If it’s obvious, why don’t explanation methods for vision models use contextual information? In this paper, we (1) review the most popular methods of explaining computer vision models by pointing out that they do not take into account context information, (2) show examples of failures of popular XAI methods, (3) provide examples of real-world use cases where spatial context plays a significant role, (4) propose new research directions that may lead to better use of context information in explaining computer vision models, (5) argue that a change in approach to explanations is needed from where to how.

Cite this Paper


BibTeX
@InProceedings{pmlr-v235-tomaszewska24a,
  title = {Position: Do Not Explain Vision Models Without Context},
  author = {Tomaszewska, Paulina and Biecek, Przemyslaw},
  booktitle = {Proceedings of the 41st International Conference on Machine Learning},
  pages = {48390--48403},
  year = {2024},
  editor = {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix},
  volume = {235},
  series = {Proceedings of Machine Learning Research},
  month = {21--27 Jul},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v235/main/assets/tomaszewska24a/tomaszewska24a.pdf},
  url = {https://proceedings.mlr.press/v235/tomaszewska24a.html},
  abstract = {Does the stethoscope in the picture make the adjacent person a doctor or a patient? This, of course, depends on the contextual relationship of the two objects. If it’s obvious, why don’t explanation methods for vision models use contextual information? In this paper, we (1) review the most popular methods of explaining computer vision models by pointing out that they do not take into account context information, (2) show examples of failures of popular XAI methods, (3) provide examples of real-world use cases where spatial context plays a significant role, (4) propose new research directions that may lead to better use of context information in explaining computer vision models, (5) argue that a change in approach to explanations is needed from where to how.}
}
Endnote
%0 Conference Paper
%T Position: Do Not Explain Vision Models Without Context
%A Paulina Tomaszewska
%A Przemyslaw Biecek
%B Proceedings of the 41st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Ruslan Salakhutdinov
%E Zico Kolter
%E Katherine Heller
%E Adrian Weller
%E Nuria Oliver
%E Jonathan Scarlett
%E Felix Berkenkamp
%F pmlr-v235-tomaszewska24a
%I PMLR
%P 48390--48403
%U https://proceedings.mlr.press/v235/tomaszewska24a.html
%V 235
%X Does the stethoscope in the picture make the adjacent person a doctor or a patient? This, of course, depends on the contextual relationship of the two objects. If it’s obvious, why don’t explanation methods for vision models use contextual information? In this paper, we (1) review the most popular methods of explaining computer vision models by pointing out that they do not take into account context information, (2) show examples of failures of popular XAI methods, (3) provide examples of real-world use cases where spatial context plays a significant role, (4) propose new research directions that may lead to better use of context information in explaining computer vision models, (5) argue that a change in approach to explanations is needed from where to how.
APA
Tomaszewska, P. & Biecek, P. (2024). Position: Do Not Explain Vision Models Without Context. Proceedings of the 41st International Conference on Machine Learning, in Proceedings of Machine Learning Research 235:48390-48403. Available from https://proceedings.mlr.press/v235/tomaszewska24a.html.
