Concept Bottleneck Model with Zero Performance Loss

Zhenzhen Wang, Aleksander Popel, Jeremias Sulam
Conference on Parsimony and Learning, PMLR 280:433-461, 2025.

Abstract

Interpreting machine learning models with high-level, human-understandable concepts has gained increasing importance. The concept bottleneck model (CBM) is a popular approach for providing such explanations but typically sacrifices some prediction power compared with standard black-box models. In this work, we propose an approach to turn an off-the-shelf black-box model into a CBM without changing its predictions or compromising prediction power. Through an invertible mapping from the model’s latent space to a concept space, predictions are decomposed into a linear combination of concepts. This provides concept-based explanations for the complex model and allows us to intervene in its predictions manually. Experiments across benchmarks demonstrate that CBM-zero provides comparable explainability and better accuracy than other CBM methods.
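
To make the stated mechanism concrete, below is a minimal NumPy sketch of the idea in the abstract, not the authors' implementation. It assumes the black box factors as a linear head over a latent embedding, f(x) = W h(x) + b, and uses a square, invertible matrix M as the latent-to-concept map; all names (h, W, b, M) are illustrative.

import numpy as np

rng = np.random.default_rng(0)
d, k, n_classes = 16, 16, 5          # latent dim, concept dim (square case), classes

W = rng.normal(size=(n_classes, d))  # black-box head weights
b = rng.normal(size=n_classes)       # black-box head bias
M = rng.normal(size=(k, d))          # latent-to-concept map; a random square
                                     # Gaussian matrix is invertible almost surely

h = rng.normal(size=d)               # latent embedding h(x) for one input

c = M @ h                            # concept activations
W_c = W @ np.linalg.inv(M)           # per-concept weights W M^{-1}
logits_concepts = W_c @ c + b        # prediction as a linear combination of concepts
logits_blackbox = W @ h + b          # original black-box prediction

# Since M is invertible, the two predictions coincide exactly: zero performance loss.
assert np.allclose(logits_concepts, logits_blackbox)

Because the concept scores are an invertible function of the latent code, editing entries of c and mapping back through M^{-1} also gives the manual intervention on predictions that the abstract mentions.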

Cite this Paper


BibTeX
@InProceedings{pmlr-v280-wang25b,
  title     = {Concept Bottleneck Model with Zero Performance Loss},
  author    = {Wang, Zhenzhen and Popel, Aleksander and Sulam, Jeremias},
  booktitle = {Conference on Parsimony and Learning},
  pages     = {433--461},
  year      = {2025},
  editor    = {Chen, Beidi and Liu, Shijia and Pilanci, Mert and Su, Weijie and Sulam, Jeremias and Wang, Yuxiang and Zhu, Zhihui},
  volume    = {280},
  series    = {Proceedings of Machine Learning Research},
  month     = {24--27 Mar},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v280/main/assets/wang25b/wang25b.pdf},
  url       = {https://proceedings.mlr.press/v280/wang25b.html}
}
Endnote
%0 Conference Paper
%T Concept Bottleneck Model with Zero Performance Loss
%A Zhenzhen Wang
%A Aleksander Popel
%A Jeremias Sulam
%B Conference on Parsimony and Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Beidi Chen
%E Shijia Liu
%E Mert Pilanci
%E Weijie Su
%E Jeremias Sulam
%E Yuxiang Wang
%E Zhihui Zhu
%F pmlr-v280-wang25b
%I PMLR
%P 433--461
%U https://proceedings.mlr.press/v280/wang25b.html
%V 280
APA
Wang, Z., Popel, A., & Sulam, J. (2025). Concept Bottleneck Model with Zero Performance Loss. Conference on Parsimony and Learning, in Proceedings of Machine Learning Research 280:433-461. Available from https://proceedings.mlr.press/v280/wang25b.html.