Counterfactual Explanations for Conformal Prediction Sets

Aicha Maalej, Cecilia Sönströd, Ulf Johansson
Proceedings of the Fourteenth Symposium on Conformal and Probabilistic Prediction with Applications, PMLR 266:405-424, 2025.

Abstract

Conformal classification outputs prediction sets with formal guarantees, making it suitable for uncertainty-aware decision support. However, explaining such prediction sets remains an open challenge, as most existing explanation methods, including counterfactual ones, are tailored to point predictions. In this paper, we introduce a novel form of counterfactual explanations for conformal classifiers. These counterfactuals identify minimal changes that modify the conformal prediction set at a fixed significance level, thereby explaining how and why certain classes are included or excluded. To guide the generation of informative counterfactuals, we consider proximity, sparsity, and plausibility. While proximity and sparsity are commonly used in the literature, we introduce credibility as a new measure of how well a counterfactual conforms to the underlying data distribution, and hence its plausibility. We empirically evaluate our method across multiple tabular datasets and optimization criteria. The findings demonstrate the potential of using counterfactual explanations for conformal classification as informative and trustworthy explanations for conformal prediction sets.
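To make the setting concrete, the sketch below illustrates the kind of pipeline the abstract describes: an inductive (split) conformal classifier that outputs a prediction set at a fixed significance level, plus a search for a nearby input whose prediction set differs. This is not the authors' implementation; the nonconformity score (one minus the predicted probability of a class), the random single-feature search, and the use of the highest conformal p-value as a credibility/plausibility proxy are all assumptions made purely for illustration.

import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

def fit_icp(X_prop, y_prop, X_cal, y_cal):
    """Train the underlying model and collect calibration nonconformity scores."""
    model = RandomForestClassifier(random_state=0).fit(X_prop, y_prop)
    proba = model.predict_proba(X_cal)
    # Nonconformity score: 1 - probability assigned to the true class (an assumption).
    cal_scores = 1.0 - proba[np.arange(len(y_cal)), y_cal]
    return model, cal_scores

def p_values(model, cal_scores, x):
    """Conformal p-value for every class label of a single instance x."""
    proba = model.predict_proba(x.reshape(1, -1))[0]
    scores = 1.0 - proba
    n = len(cal_scores)
    return np.array([(np.sum(cal_scores >= s) + 1) / (n + 1) for s in scores])

def prediction_set(model, cal_scores, x, epsilon=0.1):
    """Classes whose p-value exceeds the significance level epsilon."""
    return {int(i) for i in np.flatnonzero(p_values(model, cal_scores, x) > epsilon)}

def credibility(model, cal_scores, x):
    """Highest p-value over all classes, used here as a rough plausibility proxy."""
    return p_values(model, cal_scores, x).max()

def find_counterfactual(model, cal_scores, x, feat_std, epsilon=0.1, max_iter=500, seed=0):
    """Random single-feature search for a nearby point with a different prediction set."""
    target = prediction_set(model, cal_scores, x, epsilon)
    rng = np.random.default_rng(seed)
    best = None
    for _ in range(max_iter):
        cf = x.copy()
        j = rng.integers(x.shape[0])          # change a single feature (sparsity)
        cf[j] += rng.normal() * feat_std[j]   # perturbation scaled to the data (proximity)
        if prediction_set(model, cal_scores, cf, epsilon) != target:
            # Keep the closest set-changing candidate found so far.
            if best is None or np.abs(cf - x).sum() < np.abs(best - x).sum():
                best = cf
    return best

if __name__ == "__main__":
    X, y = load_iris(return_X_y=True)
    X_prop, X_rest, y_prop, y_rest = train_test_split(X, y, test_size=0.4, random_state=0)
    X_cal, X_test, y_cal, y_test = train_test_split(X_rest, y_rest, test_size=0.5, random_state=0)
    model, cal_scores = fit_icp(X_prop, y_prop, X_cal, y_cal)
    feat_std = X_prop.std(axis=0)
    x0 = X_test[0]
    print("prediction set:", prediction_set(model, cal_scores, x0))
    print("credibility:   ", round(credibility(model, cal_scores, x0), 3))
    cf = find_counterfactual(model, cal_scores, x0, feat_std)
    if cf is not None:
        print("counterfactual set:", prediction_set(model, cal_scores, cf))
        print("changed features:  ", np.flatnonzero(cf != x0))
        print("cf credibility:    ", round(credibility(model, cal_scores, cf), 3))

In this toy run, the counterfactual shows which single-feature change would alter which classes the conformal classifier includes in its set, and the credibility value gives a coarse indication of how plausible that modified instance is under the calibration data.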

Cite this Paper


BibTeX
@InProceedings{pmlr-v266-maalej25a,
  title     = {Counterfactual Explanations for Conformal Prediction Sets},
  author    = {Maalej, Aicha and S\"{o}nstr\"{o}d, Cecilia and Johansson, Ulf},
  booktitle = {Proceedings of the Fourteenth Symposium on Conformal and Probabilistic Prediction with Applications},
  pages     = {405--424},
  year      = {2025},
  editor    = {Nguyen, Khuong An and Luo, Zhiyuan and Papadopoulos, Harris and Löfström, Tuwe and Carlsson, Lars and Boström, Henrik},
  volume    = {266},
  series    = {Proceedings of Machine Learning Research},
  month     = {10--12 Sep},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v266/main/assets/maalej25a/maalej25a.pdf},
  url       = {https://proceedings.mlr.press/v266/maalej25a.html},
  abstract  = {Conformal classification outputs prediction sets with formal guarantees, making it suitable for uncertainty-aware decision support. However, explaining such prediction sets remains an open challenge, as most existing explanation methods, including counterfactual ones, are tailored to point predictions. In this paper, we introduce a novel form of counterfactual explanations for conformal classifiers. These counterfactuals identify minimal changes that modify the conformal prediction set at a fixed significance level, thereby explaining how and why certain classes are included or excluded. To guide the generation of informative counterfactuals, we consider proximity, sparsity, and plausibility. While proximity and sparsity are commonly used in the literature, we introduce credibility as a new measure of how well a counterfactual conforms to the underlying data distribution, and hence its plausibility. We empirically evaluate our method across multiple tabular datasets and optimization criteria. The findings demonstrate the potential of using counterfactual explanations for conformal classification as informative and trustworthy explanations for conformal prediction sets.}
}
Endnote
%0 Conference Paper
%T Counterfactual Explanations for Conformal Prediction Sets
%A Aicha Maalej
%A Cecilia Sönströd
%A Ulf Johansson
%B Proceedings of the Fourteenth Symposium on Conformal and Probabilistic Prediction with Applications
%C Proceedings of Machine Learning Research
%D 2025
%E Khuong An Nguyen
%E Zhiyuan Luo
%E Harris Papadopoulos
%E Tuwe Löfström
%E Lars Carlsson
%E Henrik Boström
%F pmlr-v266-maalej25a
%I PMLR
%P 405--424
%U https://proceedings.mlr.press/v266/maalej25a.html
%V 266
%X Conformal classification outputs prediction sets with formal guarantees, making it suitable for uncertainty-aware decision support. However, explaining such prediction sets remains an open challenge, as most existing explanation methods, including counterfactual ones, are tailored to point predictions. In this paper, we introduce a novel form of counterfactual explanations for conformal classifiers. These counterfactuals identify minimal changes that modify the conformal prediction set at a fixed significance level, thereby explaining how and why certain classes are included or excluded. To guide the generation of informative counterfactuals, we consider proximity, sparsity, and plausibility. While proximity and sparsity are commonly used in the literature, we introduce credibility as a new measure of how well a counterfactual conforms to the underlying data distribution, and hence its plausibility. We empirically evaluate our method across multiple tabular datasets and optimization criteria. The findings demonstrate the potential of using counterfactual explanations for conformal classification as informative and trustworthy explanations for conformal prediction sets.
APA
Maalej, A., Sönströd, C. & Johansson, U. (2025). Counterfactual Explanations for Conformal Prediction Sets. Proceedings of the Fourteenth Symposium on Conformal and Probabilistic Prediction with Applications, in Proceedings of Machine Learning Research 266:405-424. Available from https://proceedings.mlr.press/v266/maalej25a.html.
