Quantifying the Privacy Risks of Learning High-Dimensional Graphical Models

Sasi Kumar Murakonda, Reza Shokri, George Theodorakopoulos
Proceedings of The 24th International Conference on Artificial Intelligence and Statistics, PMLR 130:2287-2295, 2021.

Abstract

Models leak information about their training data. This enables attackers to infer sensitive information about their training sets, notably to determine whether a data sample was part of the model’s training set. Existing works empirically demonstrate the possibility of such membership inference (tracing) attacks against complex deep learning models. However, the attack results depend on the specific training data, can be obtained only after the tedious process of training the model and performing the attack, and lack any measure of the confidence and unused potential power of the attack. In this paper, we theoretically analyze the maximum power of tracing attacks against high-dimensional graphical models, with a focus on Bayesian networks. We provide a tight upper bound on the power (true positive rate) of these attacks, with respect to their error (false positive rate), for a given model structure, even before learning its parameters. As it should be, the bound is independent of the knowledge and algorithm of any specific attack. It can help identify which model structures leak more information, how adding new parameters to the model increases its privacy risk, and what can be gained by adding new data points to decrease the overall information leakage. It provides a measure of the potential leakage of a model given its structure, as a function of the model complexity and the size of the training set.
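
As a rough illustration of the power/error trade-off that the bound refers to, the sketch below runs a simple likelihood-ratio membership inference attack against a fully-factorized Bayesian network (independent Bernoulli features) whose parameters are estimated from n training records, and empirically measures the attack's true positive rate at a fixed false positive rate. This is a hypothetical simulation, not the paper's analytical bound; the names and values d, n, trials, and target_fpr are assumptions chosen purely for illustration.

import numpy as np

rng = np.random.default_rng(0)
d = 50             # number of binary features (model dimensionality) -- assumed value
n = 200            # training-set size -- assumed value
trials = 2000      # Monte Carlo trials
target_fpr = 0.05  # attack error (false positive rate) at which power is measured

true_p = rng.uniform(0.2, 0.8, size=d)  # population parameters of the Bernoulli features

member_scores, nonmember_scores = [], []
for _ in range(trials):
    train = rng.binomial(1, true_p, size=(n, d))   # draw a training set from the population
    p_hat = (train.sum(axis=0) + 1.0) / (n + 2.0)  # released model: smoothed per-feature estimates

    def log_lr(x):
        # log-likelihood ratio of a record under the released model vs. the population
        return np.sum(x * np.log(p_hat / true_p) + (1 - x) * np.log((1 - p_hat) / (1 - true_p)))

    member_scores.append(log_lr(train[0]))                            # a record from the training set
    nonmember_scores.append(log_lr(rng.binomial(1, true_p, size=d)))  # a fresh record from the population

# Choose the decision threshold so that the false positive rate is ~target_fpr,
# then measure the true positive rate (the attack's power) at that threshold.
threshold = np.quantile(nonmember_scores, 1.0 - target_fpr)
power = np.mean(np.array(member_scores) > threshold)
print(f"empirical attack power at FPR={target_fpr}: {power:.3f}")

Increasing d (more model parameters) or decreasing n (fewer training records) in this simulation should raise the measured power, mirroring the abstract's point that the potential leakage grows with model complexity and shrinks with the size of the training set.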

Cite this Paper


BibTeX
@InProceedings{pmlr-v130-kumar-murakonda21a,
  title     = {Quantifying the Privacy Risks of Learning High-Dimensional Graphical Models},
  author    = {Kumar Murakonda, Sasi and Shokri, Reza and Theodorakopoulos, George},
  booktitle = {Proceedings of The 24th International Conference on Artificial Intelligence and Statistics},
  pages     = {2287--2295},
  year      = {2021},
  editor    = {Banerjee, Arindam and Fukumizu, Kenji},
  volume    = {130},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--15 Apr},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v130/kumar-murakonda21a/kumar-murakonda21a.pdf},
  url       = {https://proceedings.mlr.press/v130/kumar-murakonda21a.html},
  abstract  = {Models leak information about their training data. This enables attackers to infer sensitive information about their training sets, notably determine if a data sample was part of the model’s training set. The existing works empirically show the possibility of these membership inference (tracing) attacks against complex deep learning models. However, the attack results are dependent on the specific training data, can be obtained only after the tedious process of training the model and performing the attack, and are missing any measure of the confidence and unused potential power of the attack. In this paper, we theoretically analyze the maximum power of tracing attacks against high-dimensional graphical models, with the focus on Bayesian networks. We provide a tight upper bound on the power (true positive rate) of these attacks, with respect to their error (false positive rate), for a given model structure even before learning its parameters. As it should be, the bound is independent of the knowledge and algorithm of any specific attack. It can help in identifying which model structures leak more information, how adding new parameters to the model increases its privacy risk, and what can be gained by adding new data points to decrease the overall information leakage. It provides a measure of the potential leakage of a model given its structure, as a function of the model complexity and the size of the training set.}
}
Endnote
%0 Conference Paper
%T Quantifying the Privacy Risks of Learning High-Dimensional Graphical Models
%A Sasi Kumar Murakonda
%A Reza Shokri
%A George Theodorakopoulos
%B Proceedings of The 24th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2021
%E Arindam Banerjee
%E Kenji Fukumizu
%F pmlr-v130-kumar-murakonda21a
%I PMLR
%P 2287--2295
%U https://proceedings.mlr.press/v130/kumar-murakonda21a.html
%V 130
%X Models leak information about their training data. This enables attackers to infer sensitive information about their training sets, notably determine if a data sample was part of the model’s training set. The existing works empirically show the possibility of these membership inference (tracing) attacks against complex deep learning models. However, the attack results are dependent on the specific training data, can be obtained only after the tedious process of training the model and performing the attack, and are missing any measure of the confidence and unused potential power of the attack. In this paper, we theoretically analyze the maximum power of tracing attacks against high-dimensional graphical models, with the focus on Bayesian networks. We provide a tight upper bound on the power (true positive rate) of these attacks, with respect to their error (false positive rate), for a given model structure even before learning its parameters. As it should be, the bound is independent of the knowledge and algorithm of any specific attack. It can help in identifying which model structures leak more information, how adding new parameters to the model increases its privacy risk, and what can be gained by adding new data points to decrease the overall information leakage. It provides a measure of the potential leakage of a model given its structure, as a function of the model complexity and the size of the training set.
APA
Kumar Murakonda, S., Shokri, R., & Theodorakopoulos, G. (2021). Quantifying the Privacy Risks of Learning High-Dimensional Graphical Models. Proceedings of The 24th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 130:2287-2295. Available from https://proceedings.mlr.press/v130/kumar-murakonda21a.html.