Understanding Inter-Concept Relationships in Concept-Based Models

Naveen Janaki Raman, Mateo Espinosa Zarlenga, Mateja Jamnik
Proceedings of the 41st International Conference on Machine Learning, PMLR 235:42009-42025, 2024.

Abstract

Concept-based explainability methods provide insight into deep learning systems by constructing explanations using human-understandable concepts. While the literature on human reasoning demonstrates that we exploit relationships between concepts when solving tasks, it is unclear whether concept-based methods incorporate the rich structure of inter-concept relationships. We analyse the concept representations learnt by concept-based models to understand whether these models correctly capture inter-concept relationships. First, we empirically demonstrate that state-of-the-art concept-based models produce representations that lack stability and robustness, and such methods fail to capture inter-concept relationships. Then, we develop a novel algorithm which leverages inter-concept relationships to improve concept intervention accuracy, demonstrating how correctly capturing inter-concept relationships can improve downstream tasks.

Cite this Paper


BibTeX
@InProceedings{pmlr-v235-raman24a,
  title     = {Understanding Inter-Concept Relationships in Concept-Based Models},
  author    = {Raman, Naveen Janaki and Espinosa Zarlenga, Mateo and Jamnik, Mateja},
  booktitle = {Proceedings of the 41st International Conference on Machine Learning},
  pages     = {42009--42025},
  year      = {2024},
  editor    = {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix},
  volume    = {235},
  series    = {Proceedings of Machine Learning Research},
  month     = {21--27 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v235/main/assets/raman24a/raman24a.pdf},
  url       = {https://proceedings.mlr.press/v235/raman24a.html},
  abstract  = {Concept-based explainability methods provide insight into deep learning systems by constructing explanations using human-understandable concepts. While the literature on human reasoning demonstrates that we exploit relationships between concepts when solving tasks, it is unclear whether concept-based methods incorporate the rich structure of inter-concept relationships. We analyse the concept representations learnt by concept-based models to understand whether these models correctly capture inter-concept relationships. First, we empirically demonstrate that state-of-the-art concept-based models produce representations that lack stability and robustness, and such methods fail to capture inter-concept relationships. Then, we develop a novel algorithm which leverages inter-concept relationships to improve concept intervention accuracy, demonstrating how correctly capturing inter-concept relationships can improve downstream tasks.}
}
Endnote
%0 Conference Paper
%T Understanding Inter-Concept Relationships in Concept-Based Models
%A Naveen Janaki Raman
%A Mateo Espinosa Zarlenga
%A Mateja Jamnik
%B Proceedings of the 41st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Ruslan Salakhutdinov
%E Zico Kolter
%E Katherine Heller
%E Adrian Weller
%E Nuria Oliver
%E Jonathan Scarlett
%E Felix Berkenkamp
%F pmlr-v235-raman24a
%I PMLR
%P 42009--42025
%U https://proceedings.mlr.press/v235/raman24a.html
%V 235
%X Concept-based explainability methods provide insight into deep learning systems by constructing explanations using human-understandable concepts. While the literature on human reasoning demonstrates that we exploit relationships between concepts when solving tasks, it is unclear whether concept-based methods incorporate the rich structure of inter-concept relationships. We analyse the concept representations learnt by concept-based models to understand whether these models correctly capture inter-concept relationships. First, we empirically demonstrate that state-of-the-art concept-based models produce representations that lack stability and robustness, and such methods fail to capture inter-concept relationships. Then, we develop a novel algorithm which leverages inter-concept relationships to improve concept intervention accuracy, demonstrating how correctly capturing inter-concept relationships can improve downstream tasks.
APA
Raman, N.J., Espinosa Zarlenga, M. & Jamnik, M. (2024). Understanding Inter-Concept Relationships in Concept-Based Models. Proceedings of the 41st International Conference on Machine Learning, in Proceedings of Machine Learning Research 235:42009-42025. Available from https://proceedings.mlr.press/v235/raman24a.html.