Reconstructing Training Data from Model Gradient, Provably
Proceedings of The 26th International Conference on Artificial Intelligence and Statistics, PMLR 206:6595-6612, 2023.
Abstract
Understanding when and how much a model gradient leaks information about the training samples is an important question in privacy. In this paper, we present a surprising result: even without training on or memorizing the data, we can fully reconstruct the training samples from a single gradient query at a randomly chosen parameter value. We prove the identifiability of the training data under mild assumptions, for both shallow and deep neural networks and a wide range of activation functions. We also present a statistically and computationally efficient algorithm based on tensor decomposition that reconstructs the training data. As a provable attack that reveals sensitive training data, our findings suggest potentially severe threats to privacy, especially in federated learning.
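To make the leakage mechanism concrete, below is a minimal toy sketch. It is our own illustration, not the paper's tensor-decomposition algorithm or the authors' code: for a two-layer network with a single training sample and squared loss, every nonzero row of the first-layer gradient taken at randomly chosen parameters is a scalar multiple of that sample, so one gradient query already identifies the sample up to scale. The network shape, the tanh activation, and all variable names are assumptions made for illustration.

```python
# Toy illustration (not the paper's algorithm): for f(x) = a^T tanh(W x)
# with squared loss L = 0.5 * (f(x) - y)^2, the gradient of L with respect
# to each first-layer row w_j at random parameters is
#   (f(x) - y) * a_j * tanh'(w_j^T x) * x,
# i.e. a scalar multiple of the (secret) sample x.
import numpy as np

rng = np.random.default_rng(0)
d, m = 20, 50                      # input dimension, hidden width

x = rng.normal(size=d)             # the secret training sample
y = 1.0                            # its label

W = rng.normal(size=(m, d))        # randomly chosen parameters (never trained)
a = rng.normal(size=m)

# Forward pass; tanh'(z) = 1 - tanh(z)^2
h = np.tanh(W @ x)
residual = a @ h - y

# One gradient query: dL/dW = residual * diag(a * (1 - h^2)) x^T
grad_W = residual * np.outer(a * (1.0 - h**2), x)   # shape (m, d)

# Attack: take the largest-norm row; its direction reveals x up to scale.
row = grad_W[np.argmax(np.linalg.norm(grad_W, axis=1))]
x_hat = row / np.linalg.norm(row)

cos = abs(x_hat @ x) / np.linalg.norm(x)
print(f"cosine similarity between reconstruction and sample: {cos:.6f}")  # ~1.0
```

With several training samples, each gradient row becomes a mixture of the samples rather than a multiple of one of them; disentangling that mixture is where the paper's tensor-decomposition machinery comes in.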