Counterfactual generation for Out-of-Distribution data

Nawid Keshtmand, Raul Santos-Rodriguez, Jonathan Lawry
Proceedings of the 7th Northern Lights Deep Learning Conference (NLDL), PMLR 307:235-246, 2026.

Abstract

Deploying machine learning models in safety-critical applications necessitates both reliable out-of-distribution (OOD) detection and interpretable model behavior. While substantial progress has been made in OOD detection and explainable AI (XAI), the question of why a model classifies a data point as OOD remains underexplored. Counterfactual explanations are a widely used XAI approach, yet they often fail in OOD contexts, as the generated examples may themselves be OOD. To address this limitation, we introduce the concept of OOD counterfactuals: perturbed inputs that transition between distinct OOD categories to provide insight into the model's OOD classification decisions. We propose a novel method for generating OOD counterfactuals and evaluate it on synthetic, tabular, and image datasets. Empirical results demonstrate that our approach offers both quantitatively and qualitatively improved explanations compared to existing baselines.
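
The paper itself details the generation method; as a rough illustration of the general idea only (not the authors' algorithm), a counterfactual for an OOD category can be searched for with a standard gradient-based perturbation loop, where a distance penalty keeps the counterfactual close to the original input. The ood_detector interface, the category labels, and all hyperparameters below are hypothetical.

    import torch
    import torch.nn.functional as F

    def ood_counterfactual(x, ood_detector, target_category,
                           steps=200, lr=0.05, lam=0.1):
        """Perturb x until ood_detector assigns it to target_category,
        while an L2 penalty keeps the perturbation small."""
        x_cf = x.clone().detach().requires_grad_(True)
        target = torch.tensor([target_category])
        opt = torch.optim.Adam([x_cf], lr=lr)
        for _ in range(steps):
            opt.zero_grad()
            logits = ood_detector(x_cf)  # assumed shape: (1, num_categories)
            # Trade off reaching the target category against staying near x
            loss = F.cross_entropy(logits, target) + lam * (x_cf - x).norm()
            loss.backward()
            opt.step()
            with torch.no_grad():
                # Stop as soon as the detector's decision has flipped
                if ood_detector(x_cf).argmax(dim=1).item() == target_category:
                    break
        return x_cf.detach()

Here ood_detector stands in for any differentiable scorer over OOD categories; a naive search like this can itself land on OOD points, which is the failure mode the paper's method is designed to avoid.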

Cite this Paper


BibTeX
@InProceedings{pmlr-v307-keshtmand26a,
  title = {Counterfactual generation for Out-of-Distribution data},
  author = {Keshtmand, Nawid and Santos-Rodriguez, Raul and Lawry, Jonathan},
  booktitle = {Proceedings of the 7th Northern Lights Deep Learning Conference (NLDL)},
  pages = {235--246},
  year = {2026},
  editor = {Kim, Hyeongji and Ramírez Rivera, Adín and Ricaud, Benjamin},
  volume = {307},
  series = {Proceedings of Machine Learning Research},
  month = {06--08 Jan},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v307/main/assets/keshtmand26a/keshtmand26a.pdf},
  url = {https://proceedings.mlr.press/v307/keshtmand26a.html},
  abstract = {Deploying machine learning models in safety-critical applications necessitates both reliable out-of-distribution (OOD) detection and interpretable model behavior. While substantial progress has been made in OOD detection and explainable AI (XAI), the question of why a model classifies a data point as OOD remains underexplored. Counterfactual explanations are a widely used XAI approach, yet they often fail in OOD contexts, as the generated examples may themselves be OOD. To address this limitation, we introduce the concept of OOD counterfactuals: perturbed inputs that transition between distinct OOD categories to provide insight into the model's OOD classification decisions. We propose a novel method for generating OOD counterfactuals and evaluate it on synthetic, tabular, and image datasets. Empirical results demonstrate that our approach offers both quantitatively and qualitatively improved explanations compared to existing baselines.}
}
Endnote
%0 Conference Paper
%T Counterfactual generation for Out-of-Distribution data
%A Nawid Keshtmand
%A Raul Santos-Rodriguez
%A Jonathan Lawry
%B Proceedings of the 7th Northern Lights Deep Learning Conference (NLDL)
%C Proceedings of Machine Learning Research
%D 2026
%E Hyeongji Kim
%E Adín Ramírez Rivera
%E Benjamin Ricaud
%F pmlr-v307-keshtmand26a
%I PMLR
%P 235--246
%U https://proceedings.mlr.press/v307/keshtmand26a.html
%V 307
%X Deploying machine learning models in safety-critical applications necessitates both reliable out-of-distribution (OOD) detection and interpretable model behavior. While substantial progress has been made in OOD detection and explainable AI (XAI), the question of why a model classifies a data point as OOD remains underexplored. Counterfactual explanations are a widely used XAI approach, yet they often fail in OOD contexts, as the generated examples may themselves be OOD. To address this limitation, we introduce the concept of OOD counterfactuals: perturbed inputs that transition between distinct OOD categories to provide insight into the model's OOD classification decisions. We propose a novel method for generating OOD counterfactuals and evaluate it on synthetic, tabular, and image datasets. Empirical results demonstrate that our approach offers both quantitatively and qualitatively improved explanations compared to existing baselines.
APA
Keshtmand, N., Santos-Rodriguez, R. & Lawry, J. (2026). Counterfactual generation for Out-of-Distribution data. Proceedings of the 7th Northern Lights Deep Learning Conference (NLDL), in Proceedings of Machine Learning Research 307:235-246. Available from https://proceedings.mlr.press/v307/keshtmand26a.html.