Local Differential Privacy for Sampling
Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics, PMLR 108:3404-3413, 2020.
Differential privacy (DP) is a leading privacy protection focused by design on individual privacy. In the local model of DP, strong privacy is achieved by privatizing each user’s individual data before sending it to an untrusted aggregator for analysis. While in recent years local DP has been adopted for practical deployments, most research in this area focuses on problems where each individual holds a single data record. In many problems of practical interest this assumption is unrealistic since nowadays most user-owned devices collect large quantities of data (e.g. pictures, text messages, time series). We propose to model this scenario by assuming each individual holds a distribution over the space of data records, and develop novel local DP methods to sample privately from these distributions. Our main contribution is a boosting-based density estimation algorithm for learning samplers that generate synthetic data while protecting the underlying distribution of each user with local DP. We give approximation guarantees quantifying how well these samplers approximate the true distribution. Experimental results against DP kernel density estimation and DP GANs displays the quality of our results.