Inherently Interpretable Multi-Label Classification Using Class-Specific Counterfactuals

Susu Sun; Stefano Woerner; Andreas Maier; Lisa M. Koch; Christian F. Baumgartner

Medical Imaging with Deep Learning, PMLR 227:937-956, 2024.

Abstract

Interpretability is essential for machine learning algorithms in high-stakes application fields such as medical image analysis. However, high-performing black-box neural networks do not provide explanations for their predictions, which can lead to mistrust and suboptimal human-ML collaboration. Post-hoc explanation techniques, which are widely used in practice, have been shown to suffer from severe conceptual problems. Furthermore, as we show in this paper, current explanation techniques do not perform adequately in the multi-label scenario, in which multiple medical findings may co-occur in a single image. We propose Attri-Net, an inherently interpretable model for multi-label classification. Attri-Net is a powerful classifier that provides transparent, trustworthy, and human-understandable explanations. The model first generates class-specific attribution maps based on counterfactuals to identify which image regions correspond to certain medical findings. Then a simple logistic regression classifier is used to make predictions based solely on these attribution maps. We compare Attri-Net to five post-hoc explanation techniques and one inherently interpretable classifier on three chest X-ray datasets. We find that Attri-Net produces high-quality multi-label explanations consistent with clinical knowledge and has comparable classification performance to state-of-the-art classification models.
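The second stage described in the abstract, a logistic regression classifier that predicts each finding solely from its own attribution map, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `predict_from_attribution_maps`, the array shapes, and the randomly drawn maps and weights are all hypothetical placeholders, and the counterfactual generator that would produce the maps is not modelled here at all.

```python
import numpy as np


def sigmoid(z):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-z))


def predict_from_attribution_maps(maps, weights, biases):
    """Per-class logistic regression on class-specific attribution maps.

    Assumed (hypothetical) shapes:
      maps    -- (C, H, W), one attribution map per class
      weights -- (C, H, W), one linear weight per pixel and class
      biases  -- (C,),      one bias per class
    Returns an array of C independent probabilities (multi-label output).
    """
    num_classes = maps.shape[0]
    flat_maps = maps.reshape(num_classes, -1)
    flat_weights = weights.reshape(num_classes, -1)
    # Each class logit depends only on that class's own attribution map.
    logits = np.sum(flat_maps * flat_weights, axis=1) + biases
    return sigmoid(logits)


# Placeholder data standing in for generated attribution maps.
rng = np.random.default_rng(0)
maps = rng.normal(size=(3, 8, 8))
weights = rng.normal(size=(3, 8, 8))
biases = np.zeros(3)

probs = predict_from_attribution_maps(maps, weights, biases)
```

Because each logit is a linear function of a single map, a prediction can be traced back to the pixels of that map, which is the property the abstract refers to as inherent interpretability. Whether the actual model pools or downsamples the maps before the linear layer is not stated in the abstract, so the sketch leaves that step out.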

Cite this Paper

BibTeX


@InProceedings{pmlr-v227-sun24a,
  title = 	 {Inherently Interpretable Multi-Label Classification Using Class-Specific Counterfactuals},
  author =       {Sun, Susu and Woerner, Stefano and Maier, Andreas and Koch, Lisa M. and Baumgartner, Christian F.},
  booktitle = 	 {Medical Imaging with Deep Learning},
  pages = 	 {937--956},
  year = 	 {2024},
  editor = 	 {Oguz, Ipek and Noble, Jack and Li, Xiaoxiao and Styner, Martin and Baumgartner, Christian and Rusu, Mirabela and Heinmann, Tobias and Kontos, Despina and Landman, Bennett and Dawant, Benoit},
  volume = 	 {227},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {10--12 Jul},
  publisher =    {PMLR},
  pdf = 	 {https://proceedings.mlr.press/v227/sun24a/sun24a.pdf},
  url = 	 {https://proceedings.mlr.press/v227/sun24a.html},
  abstract = 	 {Interpretability is essential for machine learning algorithms in high-stakes application fields such as medical image analysis. However, high-performing black-box neural networks do not provide explanations for their predictions, which can lead to mistrust and suboptimal human-ML collaboration. Post-hoc explanation techniques, which are widely used in practice, have been shown to suffer from severe conceptual problems. Furthermore, as we show in this paper, current explanation techniques do not perform adequately in the multi-label scenario, in which multiple medical findings may co-occur in a single image. We propose Attri-Net, an inherently interpretable model for multi-label classification. Attri-Net is a powerful classifier that provides transparent, trustworthy, and human-understandable explanations. The model first generates class-specific attribution maps based on counterfactuals to identify which image regions correspond to certain medical findings. Then a simple logistic regression classifier is used to make predictions based solely on these attribution maps. We compare Attri-Net to five post-hoc explanation techniques and one inherently interpretable classifier on three chest X-ray datasets. We find that Attri-Net produces high-quality multi-label explanations consistent with clinical knowledge and has comparable classification performance to state-of-the-art classification models.}
}

Endnote

%0 Conference Paper
%T Inherently Interpretable Multi-Label Classification Using Class-Specific Counterfactuals
%A Susu Sun
%A Stefano Woerner
%A Andreas Maier
%A Lisa M. Koch
%A Christian F. Baumgartner
%B Medical Imaging with Deep Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Ipek Oguz
%E Jack Noble
%E Xiaoxiao Li
%E Martin Styner
%E Christian Baumgartner
%E Mirabela Rusu
%E Tobias Heinmann
%E Despina Kontos
%E Bennett Landman
%E Benoit Dawant
%F pmlr-v227-sun24a
%I PMLR
%P 937--956
%U https://proceedings.mlr.press/v227/sun24a.html
%V 227
%X Interpretability is essential for machine learning algorithms in high-stakes application fields such as medical image analysis. However, high-performing black-box neural networks do not provide explanations for their predictions, which can lead to mistrust and suboptimal human-ML collaboration. Post-hoc explanation techniques, which are widely used in practice, have been shown to suffer from severe conceptual problems. Furthermore, as we show in this paper, current explanation techniques do not perform adequately in the multi-label scenario, in which multiple medical findings may co-occur in a single image. We propose Attri-Net, an inherently interpretable model for multi-label classification. Attri-Net is a powerful classifier that provides transparent, trustworthy, and human-understandable explanations. The model first generates class-specific attribution maps based on counterfactuals to identify which image regions correspond to certain medical findings. Then a simple logistic regression classifier is used to make predictions based solely on these attribution maps. We compare Attri-Net to five post-hoc explanation techniques and one inherently interpretable classifier on three chest X-ray datasets. We find that Attri-Net produces high-quality multi-label explanations consistent with clinical knowledge and has comparable classification performance to state-of-the-art classification models.

APA


Sun, S., Woerner, S., Maier, A., Koch, L.M. & Baumgartner, C.F. (2024). Inherently Interpretable Multi-Label Classification Using Class-Specific Counterfactuals. Medical Imaging with Deep Learning, in Proceedings of Machine Learning Research 227:937-956. Available from https://proceedings.mlr.press/v227/sun24a.html.
