ExpProof: Operationalizing Explanations for Confidential Models with ZKPs

Chhavi Yadav, Evan Laufer, Dan Boneh, Kamalika Chaudhuri
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:70183-70204, 2025.

Abstract

In principle, explanations are intended to increase trust in machine learning models and are often mandated by regulation. However, many of the settings in which explanations are demanded are adversarial in nature: the parties involved have misaligned interests and are incentivized to manipulate explanations to suit their own purposes. As a result, explainability methods fail to be operational in such settings despite the demand. In this paper, we take a step towards operationalizing explanations in adversarial scenarios using Zero-Knowledge Proofs (ZKPs), a cryptographic primitive. Specifically, we explore ZKP-amenable versions of the popular explainability algorithm LIME and evaluate their performance on Neural Networks and Random Forests. Our code is publicly available at https://github.com/emlaufer/ExpProof.
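For readers unfamiliar with LIME, the sketch below illustrates the vanilla algorithm that the paper makes ZKP-amenable: sample perturbations around the input, query the black-box model, and fit a proximity-weighted linear surrogate whose coefficients serve as the explanation. This is a minimal illustration only; the Gaussian sampling scheme, exponential kernel, and ridge surrogate are common choices assumed here, not the paper's exact ZKP-amenable construction.

    # Minimal sketch of vanilla LIME (illustrative; not the authors'
    # ZKP-amenable variant): perturb an input, query the black-box
    # model, and fit a proximity-weighted linear surrogate locally.
    import numpy as np
    from sklearn.linear_model import Ridge

    def lime_explain(model_fn, x, num_samples=1000, kernel_width=0.75, seed=0):
        """Return per-feature weights of a local linear surrogate around x.

        model_fn: black-box function mapping an (n, d) array to class-1 scores.
        x: 1-D input instance of length d.
        """
        rng = np.random.default_rng(seed)
        d = x.shape[0]
        # Sample perturbations around x (Gaussian perturbation is one common choice).
        Z = x + rng.normal(scale=1.0, size=(num_samples, d))
        y = model_fn(Z)  # black-box predictions at the perturbed points
        # Weight samples by proximity to x with an exponential kernel.
        dists = np.linalg.norm(Z - x, axis=1)
        weights = np.exp(-(dists ** 2) / (kernel_width ** 2))
        # Fit a weighted linear surrogate; its coefficients are the explanation.
        surrogate = Ridge(alpha=1.0)
        surrogate.fit(Z, y, sample_weight=weights)
        return surrogate.coef_

    # Toy "confidential" model: the recovered weights should be roughly
    # proportional to w_true = [2, -1, 0].
    if __name__ == "__main__":
        w_true = np.array([2.0, -1.0, 0.0])
        model = lambda Z: 1.0 / (1.0 + np.exp(-Z @ w_true))  # logistic model
        print(lime_explain(model, np.zeros(3)))

Per the title and abstract, ExpProof's contribution is to make such a pipeline provable in zero knowledge, so that a verifier can check that the explanation was computed faithfully without the confidential model being revealed.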

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-yadav25a,
  title     = {{E}xp{P}roof: Operationalizing Explanations for Confidential Models with {ZKP}s},
  author    = {Yadav, Chhavi and Laufer, Evan and Boneh, Dan and Chaudhuri, Kamalika},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages     = {70183--70204},
  year      = {2025},
  editor    = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume    = {267},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--19 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/yadav25a/yadav25a.pdf},
  url       = {https://proceedings.mlr.press/v267/yadav25a.html}
}
Endnote
%0 Conference Paper
%T ExpProof: Operationalizing Explanations for Confidential Models with ZKPs
%A Chhavi Yadav
%A Evan Laufer
%A Dan Boneh
%A Kamalika Chaudhuri
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-yadav25a
%I PMLR
%P 70183--70204
%U https://proceedings.mlr.press/v267/yadav25a.html
%V 267
APA
Yadav, C., Laufer, E., Boneh, D., & Chaudhuri, K. (2025). ExpProof: Operationalizing Explanations for Confidential Models with ZKPs. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:70183-70204. Available from https://proceedings.mlr.press/v267/yadav25a.html.
