Trustworthy Machine Learning through Data-Specific Indistinguishability

Hanshen Xiao, Zhen Yang, G. Edward Suh
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:68513-68534, 2025.

Abstract

This paper studies a range of AI/ML trust concepts, including memorization, data poisoning, and copyright, each of which can be modeled as a constraint on the influence of data on a (trained) model, characterized by the outcome difference of a processing function (the training algorithm). In this realm, we show that provable trust guarantees can be efficiently provided through a new framework, termed Data-Specific Indistinguishability (DSI), which selects trust-preserving randomization that aligns tightly with the targeted outcome differences, as a relaxation of classic Input-Independent Indistinguishability (III). We establish both the theoretical and algorithmic foundations of DSI with the optimal multivariate Gaussian mechanism. We further show its applications to developing trustworthy deep learning with black-box optimizers. Experimental results on memorization mitigation, backdoor defense, and copyright protection show both the efficiency and effectiveness of the DSI noise mechanism.
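To make the abstract's core idea concrete, the sketch below shows a generic Gaussian noise mechanism whose scale is calibrated to a measured outcome difference rather than a worst-case, input-independent sensitivity. This is an illustration of the general principle only, not the paper's actual DSI algorithm: the function name, parameters, and the classic (epsilon, delta) Gaussian calibration used here are assumptions for exposition.

```python
import numpy as np

def outcome_calibrated_gaussian(output, outcome_diff, epsilon, delta, rng=None):
    """Add Gaussian noise scaled to a data-specific outcome difference.

    Illustrative sketch: `outcome_diff` plays the role of the observed
    change in `output` when the protected data is removed or altered,
    standing in for a worst-case global sensitivity. The noise scale
    follows the classic Gaussian-mechanism calibration
        sigma = sqrt(2 * ln(1.25 / delta)) * outcome_diff / epsilon,
    so a smaller observed difference yields proportionally less noise.
    """
    rng = rng if rng is not None else np.random.default_rng()
    sigma = np.sqrt(2.0 * np.log(1.25 / delta)) * outcome_diff / epsilon
    return output + rng.normal(0.0, sigma, size=np.shape(output))
```

A smaller `outcome_diff` (e.g., when the targeted data barely influences the trained model) directly reduces the injected noise, which is the intuition behind relaxing input-independent indistinguishability to a data-specific notion.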

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-xiao25h,
  title     = {Trustworthy Machine Learning through Data-Specific Indistinguishability},
  author    = {Xiao, Hanshen and Yang, Zhen and Suh, G. Edward},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages     = {68513--68534},
  year      = {2025},
  editor    = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume    = {267},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--19 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/xiao25h/xiao25h.pdf},
  url       = {https://proceedings.mlr.press/v267/xiao25h.html},
  abstract  = {This paper studies a range of AI/ML trust concepts, including memorization, data poisoning, and copyright, which can be modeled as constraints on the influence of data on a (trained) model, characterized by the outcome difference from a processing function (training algorithm). In this realm, we show that provable trust guarantees can be efficiently provided through a new framework termed Data-Specific Indistinguishability (DSI) to select trust-preserving randomization tightly aligning with targeted outcome differences, as a relaxation of the classic Input-Independent Indistinguishability (III). We establish both the theoretical and algorithmic foundations of DSI with the optimal multivariate Gaussian mechanism. We further show its applications to develop trustworthy deep learning with black-box optimizers. The experimental results on memorization mitigation, backdoor defense, and copyright protection show both the efficiency and effectiveness of the DSI noise mechanism.}
}
Endnote
%0 Conference Paper
%T Trustworthy Machine Learning through Data-Specific Indistinguishability
%A Hanshen Xiao
%A Zhen Yang
%A G. Edward Suh
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-xiao25h
%I PMLR
%P 68513--68534
%U https://proceedings.mlr.press/v267/xiao25h.html
%V 267
%X This paper studies a range of AI/ML trust concepts, including memorization, data poisoning, and copyright, which can be modeled as constraints on the influence of data on a (trained) model, characterized by the outcome difference from a processing function (training algorithm). In this realm, we show that provable trust guarantees can be efficiently provided through a new framework termed Data-Specific Indistinguishability (DSI) to select trust-preserving randomization tightly aligning with targeted outcome differences, as a relaxation of the classic Input-Independent Indistinguishability (III). We establish both the theoretical and algorithmic foundations of DSI with the optimal multivariate Gaussian mechanism. We further show its applications to develop trustworthy deep learning with black-box optimizers. The experimental results on memorization mitigation, backdoor defense, and copyright protection show both the efficiency and effectiveness of the DSI noise mechanism.
APA
Xiao, H., Yang, Z. & Suh, G.E. (2025). Trustworthy Machine Learning through Data-Specific Indistinguishability. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:68513-68534. Available from https://proceedings.mlr.press/v267/xiao25h.html.