The Price of Differential Privacy for Online Learning

Naman Agarwal, Karan Singh
Proceedings of the 34th International Conference on Machine Learning, PMLR 70:32-40, 2017.

Abstract

We design differentially private algorithms for the problem of online linear optimization in the full information and bandit settings with optimal $O(T^{0.5})$ regret bounds. In the full-information setting, our results demonstrate that $\epsilon$-differential privacy may be ensured for free – in particular, the regret bounds scale as $O(T^{0.5}+1/\epsilon)$. For bandit linear optimization, and as a special case, for non-stochastic multi-armed bandits, the proposed algorithm achieves a regret of $O(T^{0.5}/\epsilon)$, while the previously best known regret bound was $O(T^{2/3}/\epsilon)$.

Cite this Paper


BibTeX
@InProceedings{pmlr-v70-agarwal17a,
  title = {The Price of Differential Privacy for Online Learning},
  author = {Naman Agarwal and Karan Singh},
  booktitle = {Proceedings of the 34th International Conference on Machine Learning},
  pages = {32--40},
  year = {2017},
  editor = {Precup, Doina and Teh, Yee Whye},
  volume = {70},
  series = {Proceedings of Machine Learning Research},
  month = {06--11 Aug},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v70/agarwal17a/agarwal17a.pdf},
  url = {https://proceedings.mlr.press/v70/agarwal17a.html},
  abstract = {We design differentially private algorithms for the problem of online linear optimization in the full information and bandit settings with optimal $O(T^{0.5})$ regret bounds. In the full-information setting, our results demonstrate that $\epsilon$-differential privacy may be ensured for free – in particular, the regret bounds scale as $O(T^{0.5}+1/\epsilon)$. For bandit linear optimization, and as a special case, for non-stochastic multi-armed bandits, the proposed algorithm achieves a regret of $O(T^{0.5}/\epsilon)$, while the previously best known regret bound was $O(T^{2/3}/\epsilon)$.}
}
Endnote
%0 Conference Paper
%T The Price of Differential Privacy for Online Learning
%A Naman Agarwal
%A Karan Singh
%B Proceedings of the 34th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2017
%E Doina Precup
%E Yee Whye Teh
%F pmlr-v70-agarwal17a
%I PMLR
%P 32--40
%U https://proceedings.mlr.press/v70/agarwal17a.html
%V 70
%X We design differentially private algorithms for the problem of online linear optimization in the full information and bandit settings with optimal $O(T^{0.5})$ regret bounds. In the full-information setting, our results demonstrate that $\epsilon$-differential privacy may be ensured for free – in particular, the regret bounds scale as $O(T^{0.5}+1/\epsilon)$. For bandit linear optimization, and as a special case, for non-stochastic multi-armed bandits, the proposed algorithm achieves a regret of $O(T^{0.5}/\epsilon)$, while the previously best known regret bound was $O(T^{2/3}/\epsilon)$.
APA
Agarwal, N. & Singh, K. (2017). The Price of Differential Privacy for Online Learning. Proceedings of the 34th International Conference on Machine Learning, in Proceedings of Machine Learning Research 70:32-40. Available from https://proceedings.mlr.press/v70/agarwal17a.html.
