Models That Are Interpretable But Not Transparent

Chudi Zhong, Panyu Chen, Cynthia Rudin
Proceedings of The 28th International Conference on Artificial Intelligence and Statistics, PMLR 258:1648-1656, 2025.

Abstract

Faithful explanations are essential for machine learning models in high-stakes applications. Inherently interpretable models are well-suited for these applications because they naturally provide faithful explanations by revealing their decision logic. However, model designers often need to keep these models proprietary to maintain their value. This creates a tension: we need models that are interpretable, allowing human decision-makers to understand and justify predictions, but not transparent, so that the model’s decision boundary is not easily replicated by attackers. Shielding the model’s decision boundary is particularly challenging alongside the requirement of completely faithful explanations, since such explanations reveal the true logic of the model for an entire subspace around each query point. This work provides an approach, FaithfulDefense, that creates model explanations for logical models that are completely faithful, yet reveal as little as possible about the decision boundary. FaithfulDefense is based on a maximum set cover formulation, and we provide multiple formulations for it, taking advantage of submodularity.
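
The abstract's final sentence points to a maximum set cover formulation solved with the help of submodularity. As a point of reference, the sketch below shows the standard greedy routine for maximum coverage, the generic submodular-maximization technique such formulations typically build on. It is not the FaithfulDefense algorithm itself, and the candidate sets and identifiers are hypothetical.

```python
# A minimal, generic sketch of greedy maximum coverage: the standard routine for
# maximizing a monotone submodular objective under a cardinality budget. This is
# NOT the authors' FaithfulDefense algorithm; the data and identifiers below are
# hypothetical, purely to illustrate the technique the abstract names.

def greedy_max_coverage(candidate_sets, budget):
    """Greedily pick up to `budget` sets whose union covers the most elements.

    candidate_sets: dict mapping a candidate id to the set of elements it covers.
    budget: maximum number of candidates to select.
    Because coverage is monotone submodular, this greedy rule achieves the
    classic (1 - 1/e) approximation to the optimal coverage.
    """
    covered, chosen = set(), []
    for _ in range(budget):
        best_id, best_gain = None, 0
        for cid, elements in candidate_sets.items():
            if cid in chosen:
                continue
            gain = len(elements - covered)  # marginal coverage gain
            if gain > best_gain:
                best_id, best_gain = cid, gain
        if best_id is None:  # nothing left adds new coverage
            break
        chosen.append(best_id)
        covered |= candidate_sets[best_id]
    return chosen, covered


# Toy usage with made-up sets.
sets = {"r1": {1, 2, 3}, "r2": {3, 4}, "r3": {4, 5, 6, 7}}
print(greedy_max_coverage(sets, budget=2))  # (['r3', 'r1'], {1, 2, 3, 4, 5, 6, 7})
```

Loosely, one can think of the candidates as possible explanation clauses and the elements as the queries they would cover, though the paper's actual objective and constraints are specified in the full text.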

Cite this Paper


BibTeX
@InProceedings{pmlr-v258-zhong25b,
  title     = {Models That Are Interpretable But Not Transparent},
  author    = {Zhong, Chudi and Chen, Panyu and Rudin, Cynthia},
  booktitle = {Proceedings of The 28th International Conference on Artificial Intelligence and Statistics},
  pages     = {1648--1656},
  year      = {2025},
  editor    = {Li, Yingzhen and Mandt, Stephan and Agrawal, Shipra and Khan, Emtiyaz},
  volume    = {258},
  series    = {Proceedings of Machine Learning Research},
  month     = {03--05 May},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v258/main/assets/zhong25b/zhong25b.pdf},
  url       = {https://proceedings.mlr.press/v258/zhong25b.html}
}
APA
Zhong, C., Chen, P., & Rudin, C. (2025). Models That Are Interpretable But Not Transparent. Proceedings of The 28th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 258:1648-1656. Available from https://proceedings.mlr.press/v258/zhong25b.html.
