Tighter Generalization Bounds for Iterative Differentially Private Learning Algorithms
Proceedings of the Thirty-Seventh Conference on Uncertainty in Artificial Intelligence, PMLR 161:802-812, 2021.
This paper studies the relationship between generalization and privacy preservation in machine learning in two steps. We first establish an alignment between the two facets for any learning algorithm. We prove that $(\varepsilon, \delta)$-differential privacy implies an on-average generalization bound for a multi-sample-set learning algorithm, which further leads to a high-probability bound for any learning algorithm. We then investigate how the iterative nature shared by most learning algorithms influences privacy preservation and, in turn, generalization. Three composition theorems are proved to approximate the differential privacy of an iterative algorithm through the differential privacy of each of its iterations. Integrating the two steps, we eventually deliver generalization bounds for iterative learning algorithms. Our results are strictly tighter than those in existing works. In particular, our generalization bounds do not rely on the model size, which is prohibitively large in deep learning. Experiments with MLP, VGG, and ResNet on MNIST, CIFAR-10, and CIFAR-100 are in full agreement with our theory. The theory applies to a wide spectrum of learning algorithms; in this paper, it is applied to the Gaussian mechanism as an example.
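As a point of reference for the Gaussian mechanism mentioned above, the following sketch shows the standard calibration of Gaussian noise to a query's $L_2$ sensitivity to achieve $(\varepsilon, \delta)$-differential privacy. The function name and interface are illustrative, not from the paper; it uses the classical calibration $\sigma = \Delta_2 \sqrt{2 \ln(1.25/\delta)} / \varepsilon$, which is valid for $\varepsilon \in (0, 1)$.

```python
import numpy as np

def gaussian_mechanism(value, sensitivity, epsilon, delta, rng=None):
    """Release `value` with (epsilon, delta)-differential privacy by adding
    Gaussian noise calibrated to the query's L2 sensitivity.

    Illustrative sketch: sigma = sensitivity * sqrt(2 ln(1.25/delta)) / epsilon,
    the classical calibration valid for epsilon in (0, 1).
    """
    rng = np.random.default_rng() if rng is None else rng
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    return np.asarray(value) + rng.normal(0.0, sigma, size=np.shape(value))
```

An iterative algorithm such as noisy gradient descent would invoke such a mechanism once per iteration; the paper's composition theorems then bound the overall privacy loss, and hence the generalization error, of the full run.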