The Price of Differential Privacy for Online Learning

Naman Agarwal; Karan Singh

Proceedings of the 34th International Conference on Machine Learning, PMLR 70:32-40, 2017.

Abstract

We design differentially private algorithms for the problem of online linear optimization in the full information and bandit settings with optimal $O(T^{0.5})$ regret bounds. In the full-information setting, our results demonstrate that $\epsilon$-differential privacy may be ensured for free – in particular, the regret bounds scale as $O(T^{0.5}+1/\epsilon)$. For bandit linear optimization, and as a special case, for non-stochastic multi-armed bandits, the proposed algorithm achieves a regret of $O(T^{0.5}/\epsilon)$, while the previously best known regret bound was $O(T^{2/3}/\epsilon)$.

Cite this Paper

BibTeX

@InProceedings{pmlr-v70-agarwal17a,
  title = 	 {The Price of Differential Privacy for Online Learning},
  author =       {Naman Agarwal and Karan Singh},
  booktitle = 	 {Proceedings of the 34th International Conference on Machine Learning},
  pages = 	 {32--40},
  year = 	 {2017},
  editor = 	 {Precup, Doina and Teh, Yee Whye},
  volume = 	 {70},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {06--11 Aug},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v70/agarwal17a/agarwal17a.pdf},
  url = 	 {https://proceedings.mlr.press/v70/agarwal17a.html},
  abstract = 	 {We design differentially private algorithms for the problem of online linear optimization in the full information and bandit settings with optimal $O(T^{0.5})$ regret bounds. In the full-information setting, our results demonstrate that $\epsilon$-differential privacy may be ensured for free – in particular, the regret bounds scale as $O(T^{0.5}+1/\epsilon)$. For bandit linear optimization, and as a special case, for non-stochastic multi-armed bandits, the proposed algorithm achieves a regret of $O(T^{0.5}/\epsilon)$, while the previously best known regret bound was $O(T^{2/3}/\epsilon)$.}
}

Endnote

%0 Conference Paper
%T The Price of Differential Privacy for Online Learning
%A Naman Agarwal
%A Karan Singh
%B Proceedings of the 34th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2017
%E Doina Precup
%E Yee Whye Teh
%F pmlr-v70-agarwal17a
%I PMLR
%P 32--40
%U https://proceedings.mlr.press/v70/agarwal17a.html
%V 70
%X We design differentially private algorithms for the problem of online linear optimization in the full information and bandit settings with optimal $O(T^{0.5})$ regret bounds. In the full-information setting, our results demonstrate that $\epsilon$-differential privacy may be ensured for free – in particular, the regret bounds scale as $O(T^{0.5}+1/\epsilon)$. For bandit linear optimization, and as a special case, for non-stochastic multi-armed bandits, the proposed algorithm achieves a regret of $O(T^{0.5}/\epsilon)$, while the previously best known regret bound was $O(T^{2/3}/\epsilon)$.

APA

Agarwal, N. & Singh, K. (2017). The Price of Differential Privacy for Online Learning. Proceedings of the 34th International Conference on Machine Learning, in Proceedings of Machine Learning Research 70:32-40. Available from https://proceedings.mlr.press/v70/agarwal17a.html.
