The Price of Differential Privacy for Online Learning


Naman Agarwal, Karan Singh;
Proceedings of the 34th International Conference on Machine Learning, PMLR 70:32-40, 2017.

Abstract

We design differentially private algorithms for the problem of online linear optimization in the full-information and bandit settings with optimal $O(\sqrt{T})$ regret bounds. In the full-information setting, our results demonstrate that $\epsilon$-differential privacy may be ensured for free; in particular, the regret bounds scale as $O(\sqrt{T} + 1/\epsilon)$. For bandit linear optimization, and as a special case, for non-stochastic multi-armed bandits, the proposed algorithm achieves a regret of $O(\sqrt{T}/\epsilon)$, while the previously best known regret bound was $O(T^{2/3}/\epsilon)$.
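To make the full-information setting concrete, below is a minimal sketch in Python/NumPy of one standard recipe for $\epsilon$-differentially private online linear optimization: follow-the-regularized-leader over the Euclidean unit ball, with the cumulative loss vectors released through tree-based (binary counting) aggregation. This is an illustration of the problem setup only, not the authors' algorithm or its noise calibration; the function names (`private_prefix_sums`, `private_online_linear_opt`), the unit-ball decision set, and the assumption that each loss vector has $\ell_1$ norm at most 1 are assumptions made for the example.

```python
import numpy as np


def private_prefix_sums(losses, eps, rng):
    """Tree-based (binary counting) aggregation of loss vectors.

    Returns noisy prefix sums S_t = l_1 + ... + l_t for t = 1..T. Each round's
    loss vector enters at most `levels` dyadic blocks, so Laplace noise of scale
    `levels / eps` per block gives eps-differential privacy under the
    illustrative assumption that every loss vector has L1 norm at most 1.
    """
    T, d = losses.shape
    levels = int(np.floor(np.log2(T))) + 1
    scale = levels / eps
    # noisy_block[(l, j)]: noisy sum of losses over rounds [j * 2^l, (j + 1) * 2^l)
    noisy_block = {}
    for l in range(levels):
        size = 1 << l
        for j in range(T // size):
            lo, hi = j * size, (j + 1) * size
            noisy_block[(l, j)] = losses[lo:hi].sum(axis=0) + rng.laplace(0.0, scale, d)

    prefixes = np.zeros((T, d))
    for t in range(1, T + 1):
        # Decompose [0, t) into at most `levels` dyadic blocks and add their noisy sums.
        s, pos = np.zeros(d), 0
        for l in reversed(range(levels)):
            size = 1 << l
            if t - pos >= size:
                s += noisy_block[(l, pos // size)]
                pos += size
        prefixes[t - 1] = s
    return prefixes


def private_online_linear_opt(losses, eps, eta, rng):
    """Follow-the-regularized-leader over the Euclidean unit ball, driven by
    privately aggregated loss sums (an illustrative stand-in for the paper's
    full-information algorithm, not a reproduction of it)."""
    T, d = losses.shape
    # Computed offline for brevity; a streaming variant maintains the tree incrementally.
    prefixes = private_prefix_sums(losses, eps, rng)
    total = 0.0
    for t in range(T):
        noisy_sum = prefixes[t - 1] if t > 0 else np.zeros(d)
        x = -eta * noisy_sum                 # unconstrained FTRL step with Euclidean regularizer
        norm = np.linalg.norm(x)
        if norm > 1.0:                       # project back onto the unit ball
            x = x / norm
        total += float(losses[t] @ x)
    return total


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    T, d = 1024, 5
    losses = rng.uniform(-1.0, 1.0, size=(T, d))
    losses /= np.abs(losses).sum(axis=1, keepdims=True)   # normalize to L1 norm 1
    alg_loss = private_online_linear_opt(losses, eps=1.0, eta=1.0 / np.sqrt(T), rng=rng)
    # Loss of the best fixed point in the unit ball in hindsight: x* = -total / ||total||.
    best_fixed = -np.linalg.norm(losses.sum(axis=0))
    print(f"regret ~ {alg_loss - best_fixed:.2f} over T = {T} rounds")
```

The sketch reflects the intuition behind the "privacy for free" phenomenon: the private aggregation perturbs only the cumulative loss vector that FTRL already smooths through regularization, so the extra noise need not dominate the $O(\sqrt{T})$ regret; the exact noise calibration and analysis are in the paper.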
