[edit]
Algorithmically Effective Differentially Private Synthetic Data
Proceedings of Thirty Sixth Conference on Learning Theory, PMLR 195:3941-3968, 2023.
Abstract
We present a highly effective algorithmic approach for generating ε-differentially private synthetic data in a bounded metric space with near-optimal utility guarantees under the 1-Wasserstein distance. In particular, for a dataset X in the hypercube [0,1]d, our algorithm generates synthetic dataset Y such that the expected 1-Wasserstein distance between the empirical measure of X and Y is O((εn)−1/d) for d≥2, and is O(log2(εn)(εn)−1) for d=1. The accuracy guarantee is optimal up to a constant factor for d≥2, and up to a logarithmic factor for d=1. Our algorithm has a fast running time of O(εdn) for all d≥1 and demonstrates improved accuracy compared to the method in Boedihardjo et al. (2022) for d≥2.