An end-to-end Differentially Private Latent Dirichlet Allocation Using a Spectral Algorithm

Chris Decarolis, Mukul Ram, Seyed Esmaeili, Yu-Xiang Wang, Furong Huang
Proceedings of the 37th International Conference on Machine Learning, PMLR 119:2421-2431, 2020.

Abstract

We provide an end-to-end differentially private spectral algorithm for learning LDA, based on matrix/tensor decompositions, and establish theoretical guarantees on utility/consistency of the estimated model parameters. We represent the spectral algorithm as a computational graph. Noise can be injected along the edges of this graph to obtain differential privacy. We identify subsets of edges, named “configurations”, such that adding noise to all edges in such a subset guarantees differential privacy of the end-to-end spectral algorithm. We characterize the sensitivity of the edges with respect to the input and thus estimate the amount of noise to be added to each edge for any required privacy level. We then characterize the utility loss for each configuration as a function of injected noise. Overall, by combining the sensitivity and utility characterization, we obtain an end-to-end differentially private spectral algorithm for LDA and identify which configurations outperform others under specific regimes. We are the first to achieve utility guarantees under a required level of differential privacy for learning in LDA. We additionally show that our method systematically outperforms differentially private variational inference.
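The abstract's step of turning an edge's sensitivity into a noise level can be illustrated with the classical Gaussian mechanism. This is only a sketch under stated assumptions: the abstract does not say which noise distribution or calibration the paper uses, and the names `gaussian_sigma` and `privatize_edge` are hypothetical, not taken from the paper. The formula used is the standard (epsilon, delta) calibration, which is valid for 0 < epsilon < 1.

```python
import math
import random


def gaussian_sigma(l2_sensitivity, epsilon, delta):
    """Noise scale of the classical Gaussian mechanism.

    Assumption: the edge's L2 sensitivity with respect to the input is
    already known. The calibration below holds for 0 < epsilon < 1.
    """
    if not 0.0 < epsilon < 1.0:
        raise ValueError("classical calibration requires 0 < epsilon < 1")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie in (0, 1)")
    if l2_sensitivity < 0.0:
        raise ValueError("sensitivity must be non-negative")
    return math.sqrt(2.0 * math.log(1.25 / delta)) * l2_sensitivity / epsilon


def privatize_edge(values, l2_sensitivity, epsilon, delta, rng=None):
    """Add i.i.d. Gaussian noise to the values carried by one edge.

    `values` is a flat list of floats (hypothetical stand-in for a
    flattened moment matrix or tensor passed along an edge).
    """
    rng = rng or random.Random()
    sigma = gaussian_sigma(l2_sensitivity, epsilon, delta)
    return [v + rng.gauss(0.0, sigma) for v in values]


if __name__ == "__main__":
    # A higher-sensitivity edge needs proportionally more noise
    # for the same privacy level.
    for sens in (0.01, 0.1):
        print(sens, gaussian_sigma(sens, epsilon=0.5, delta=1e-5))
```

When several edges in a configuration are perturbed, the overall privacy budget has to be divided among them; how the paper does this is not stated in the abstract, so the sketch handles a single edge only.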

Cite this Paper


BibTeX
@InProceedings{pmlr-v119-decarolis20a,
  title     = {An end-to-end Differentially Private Latent {D}irichlet Allocation Using a Spectral Algorithm},
  author    = {Decarolis, Chris and Ram, Mukul and Esmaeili, Seyed and Wang, Yu-Xiang and Huang, Furong},
  booktitle = {Proceedings of the 37th International Conference on Machine Learning},
  pages     = {2421--2431},
  year      = {2020},
  editor    = {Daumé, III, Hal and Singh, Aarti},
  volume    = {119},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--18 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v119/decarolis20a/decarolis20a.pdf},
  url       = {http://proceedings.mlr.press/v119/decarolis20a.html},
  abstract  = {We provide an end-to-end differentially private spectral algorithm for learning LDA, based on matrix/tensor decompositions, and establish theoretical guarantees on utility/consistency of the estimated model parameters. We represent the spectral algorithm as a computational graph. Noise can be injected along the edges of this graph to obtain differential privacy. We identify subsets of edges, named “configurations”, such that adding noise to all edges in such a subset guarantees differential privacy of the end-to-end spectral algorithm. We characterize the sensitivity of the edges with respect to the input and thus estimate the amount of noise to be added to each edge for any required privacy level. We then characterize the utility loss for each configuration as a function of injected noise. Overall, by combining the sensitivity and utility characterization, we obtain an end-to-end differentially private spectral algorithm for LDA and identify which configurations outperform others under specific regimes. We are the first to achieve utility guarantees under a required level of differential privacy for learning in LDA. We additionally show that our method systematically outperforms differentially private variational inference.}
}
Endnote
%0 Conference Paper
%T An end-to-end Differentially Private Latent Dirichlet Allocation Using a Spectral Algorithm
%A Chris Decarolis
%A Mukul Ram
%A Seyed Esmaeili
%A Yu-Xiang Wang
%A Furong Huang
%B Proceedings of the 37th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2020
%E Hal Daumé III
%E Aarti Singh
%F pmlr-v119-decarolis20a
%I PMLR
%P 2421--2431
%U http://proceedings.mlr.press/v119/decarolis20a.html
%V 119
%X We provide an end-to-end differentially private spectral algorithm for learning LDA, based on matrix/tensor decompositions, and establish theoretical guarantees on utility/consistency of the estimated model parameters. We represent the spectral algorithm as a computational graph. Noise can be injected along the edges of this graph to obtain differential privacy. We identify subsets of edges, named “configurations”, such that adding noise to all edges in such a subset guarantees differential privacy of the end-to-end spectral algorithm. We characterize the sensitivity of the edges with respect to the input and thus estimate the amount of noise to be added to each edge for any required privacy level. We then characterize the utility loss for each configuration as a function of injected noise. Overall, by combining the sensitivity and utility characterization, we obtain an end-to-end differentially private spectral algorithm for LDA and identify which configurations outperform others under specific regimes. We are the first to achieve utility guarantees under a required level of differential privacy for learning in LDA. We additionally show that our method systematically outperforms differentially private variational inference.
APA
Decarolis, C., Ram, M., Esmaeili, S., Wang, Y.-X., & Huang, F. (2020). An end-to-end Differentially Private Latent Dirichlet Allocation Using a Spectral Algorithm. Proceedings of the 37th International Conference on Machine Learning, in Proceedings of Machine Learning Research 119:2421-2431. Available from http://proceedings.mlr.press/v119/decarolis20a.html.
