Obfuscation via Information Density Estimation

Hsiang Hsu; Shahab Asoodeh; Flavio Calmon

Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics, PMLR 108:906-917, 2020.

Abstract

Identifying features that leak information about sensitive attributes is a key challenge in the design of information obfuscation mechanisms. In this paper, we propose a framework to identify information-leaking features via information density estimation. Here, features whose information densities exceed a pre-defined threshold are deemed information-leaking features. Once these features are identified, we sequentially pass them through a targeted obfuscation mechanism with a provable leakage guarantee in terms of $\mathsf{E}_\gamma$-divergence. The core of this mechanism relies on a data-driven estimate of the trimmed information density for which we propose a novel estimator, named the trimmed information density estimator (TIDE). We then use TIDE to implement our mechanism on three real-world datasets. Our approach can be used as a data-driven pipeline for designing obfuscation mechanisms targeting specific features.
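
To make the thresholding idea in the abstract concrete, here is a minimal sketch for discrete data. It assumes the standard definition of information density, $i(s;x) = \log \frac{P_{S,X}(s,x)}{P_S(s)P_X(x)}$, and estimates it from raw counts. This plug-in estimate is not the paper's TIDE estimator, and the function names `information_density` and `leaking_pairs` are hypothetical, chosen only for illustration.

```python
import math
from collections import Counter


def information_density(pairs):
    """Plug-in estimate of i(s; x) = log(P(s, x) / (P(s) P(x))).

    `pairs` is a list of (sensitive_attribute, feature_value) samples.
    Illustrative count-based estimate only; NOT the paper's TIDE estimator.
    """
    n = len(pairs)
    joint = Counter(pairs)
    count_s = Counter(s for s, _ in pairs)
    count_x = Counter(x for _, x in pairs)
    # (c/n) / ((count_s/n) * (count_x/n)) simplifies to c*n / (count_s*count_x)
    return {
        (s, x): math.log(c * n / (count_s[s] * count_x[x]))
        for (s, x), c in joint.items()
    }


def leaking_pairs(pairs, threshold):
    """Return the (s, x) pairs whose estimated information density exceeds `threshold`."""
    density = information_density(pairs)
    return {sx for sx, value in density.items() if value > threshold}


# Toy data (made up): feature values 'a' and 'b' reveal s, value 'c' does not.
samples = [(0, "a")] * 4 + [(1, "b")] * 4 + [(0, "c"), (1, "c")]
flagged = leaking_pairs(samples, threshold=0.5)
```

On this toy data, the pairs involving `'a'` and `'b'` have an estimated density of log 2 and are flagged, while the pairs involving `'c'` have density 0 and are not. The threshold value 0.5 is arbitrary.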

Cite this Paper

BibTeX


@InProceedings{pmlr-v108-hsu20a,
  title = 	 {Obfuscation via Information Density Estimation},
  author =       {Hsu, Hsiang and Asoodeh, Shahab and Calmon, Flavio},
  booktitle = 	 {Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics},
  pages = 	 {906--917},
  year = 	 {2020},
  editor = 	 {Chiappa, Silvia and Calandra, Roberto},
  volume = 	 {108},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {26--28 Aug},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v108/hsu20a/hsu20a.pdf},
  url = 	 {https://proceedings.mlr.press/v108/hsu20a.html},
  abstract = 	 {Identifying features that leak information about sensitive attributes is a key challenge in the design of information obfuscation mechanisms. In this paper, we propose a framework to identify information-leaking features via information density estimation. Here, features whose information densities exceed a pre-defined threshold are deemed information-leaking features. Once these features are identified, we sequentially pass them through a targeted obfuscation mechanism with a provable leakage guarantee in terms of $\mathsf{E}_\gamma$-divergence. The core of this mechanism relies on a data-driven estimate of the trimmed information density for which we propose a novel estimator, named the \textit{trimmed information density estimator} (TIDE). We then use TIDE to implement our mechanism on three real-world datasets. Our approach can be used as a data-driven pipeline for designing obfuscation mechanisms targeting specific features.}
}

Endnote

%0 Conference Paper
%T Obfuscation via Information Density Estimation
%A Hsiang Hsu
%A Shahab Asoodeh
%A Flavio Calmon
%B Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2020
%E Silvia Chiappa
%E Roberto Calandra
%F pmlr-v108-hsu20a
%I PMLR
%P 906--917
%U https://proceedings.mlr.press/v108/hsu20a.html
%V 108
%X Identifying features that leak information about sensitive attributes is a key challenge in the design of information obfuscation mechanisms. In this paper, we propose a framework to identify information-leaking features via information density estimation. Here, features whose information densities exceed a pre-defined threshold are deemed information-leaking features. Once these features are identified, we sequentially pass them through a targeted obfuscation mechanism with a provable leakage guarantee in terms of $\mathsf{E}_\gamma$-divergence. The core of this mechanism relies on a data-driven estimate of the trimmed information density for which we propose a novel estimator, named the \textit{trimmed information density estimator} (TIDE). We then use TIDE to implement our mechanism on three real-world datasets. Our approach can be used as a data-driven pipeline for designing obfuscation mechanisms targeting specific features.

APA


Hsu, H., Asoodeh, S. & Calmon, F. (2020). Obfuscation via Information Density Estimation. Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 108:906-917. Available from https://proceedings.mlr.press/v108/hsu20a.html.
