Corrupt Bandits for Preserving Local Privacy

Pratik Gajane, Tanguy Urvoy, Emilie Kaufmann
Proceedings of Algorithmic Learning Theory, PMLR 83:387-412, 2018.

Abstract

We study a variant of the stochastic multi-armed bandit (MAB) problem in which the rewards are corrupted. In this framework, motivated by privacy preservation in online recommender systems, the goal is to maximize the sum of the (unobserved) rewards, based on the observation of a transformation of these rewards through a stochastic corruption process with known parameters. We provide a lower bound on the expected regret of any bandit algorithm in this corrupted setting. We devise a frequentist algorithm, KLUCB-CF, and a Bayesian algorithm, TS-CF, and give upper bounds on their regret. We also provide the appropriate corruption parameters to guarantee a desired level of local privacy and analyze how this impacts the regret. Finally, we present some experimental results that confirm our analysis.
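As a concrete illustration of the setting described above (a minimal sketch, not code from the paper): a standard stochastic corruption process that yields local differential privacy is randomized response, where a Bernoulli reward is reported truthfully with a known probability p and flipped otherwise. Since p is known, the learner can form an unbiased estimate of the true mean from the corrupted observations. The function names below (`corrupt`, `p_for_epsilon`, `debias`) are illustrative, not from the paper.

```python
import math
import random

def corrupt(reward, p):
    """Randomized response: report the true Bernoulli reward with
    probability p, and its flip with probability 1 - p."""
    return reward if random.random() < p else 1 - reward

def p_for_epsilon(epsilon):
    """Truth-telling probability achieving epsilon-local differential
    privacy for randomized response on binary values."""
    return math.exp(epsilon) / (1.0 + math.exp(epsilon))

def debias(corrupted_mean, p):
    """Unbiased estimate of the true mean mu from the corrupted mean:
    E[corrupted] = p * mu + (1 - p) * (1 - mu), so
    mu = (E[corrupted] - (1 - p)) / (2p - 1)."""
    return (corrupted_mean - (1.0 - p)) / (2.0 * p - 1.0)

# Example: recover the mean of a Bernoulli(0.7) arm from corrupted feedback.
random.seed(0)
p = p_for_epsilon(1.0)  # roughly 0.731 for epsilon = 1
true_mu, n = 0.7, 200_000
observations = [corrupt(1 if random.random() < true_mu else 0, p)
                for _ in range(n)]
estimate = debias(sum(observations) / n, p)
```

Note that the smaller epsilon is (stronger privacy), the closer p gets to 1/2 and the larger the variance of the debiased estimate, which is the mechanism behind the regret penalty for privacy that the paper analyzes.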

Cite this Paper


BibTeX
@InProceedings{pmlr-v83-gajane18a,
  title     = {Corrupt Bandits for Preserving Local Privacy},
  author    = {Gajane, Pratik and Urvoy, Tanguy and Kaufmann, Emilie},
  booktitle = {Proceedings of Algorithmic Learning Theory},
  pages     = {387--412},
  year      = {2018},
  editor    = {Janoos, Firdaus and Mohri, Mehryar and Sridharan, Karthik},
  volume    = {83},
  series    = {Proceedings of Machine Learning Research},
  month     = {07--09 Apr},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v83/gajane18a/gajane18a.pdf},
  url       = {https://proceedings.mlr.press/v83/gajane18a.html},
  abstract  = {We study a variant of the stochastic multi-armed bandit (MAB) problem in which the rewards are corrupted. In this framework, motivated by privacy preservation in online recommender systems, the goal is to maximize the sum of the (unobserved) rewards, based on the observation of transformation of these rewards through a stochastic corruption process with known parameters. We provide a lower bound on the expected regret of any bandit algorithm in this corrupted setting. We devise a frequentist algorithm, KLUCB-CF, and a Bayesian algorithm, TS-CF and give upper bounds on their regret. We also provide the appropriate corruption parameters to guarantee a desired level of local privacy and analyze how this impacts the regret. Finally, we present some experimental results that confirm our analysis.}
}
Endnote
%0 Conference Paper
%T Corrupt Bandits for Preserving Local Privacy
%A Pratik Gajane
%A Tanguy Urvoy
%A Emilie Kaufmann
%B Proceedings of Algorithmic Learning Theory
%C Proceedings of Machine Learning Research
%D 2018
%E Firdaus Janoos
%E Mehryar Mohri
%E Karthik Sridharan
%F pmlr-v83-gajane18a
%I PMLR
%P 387--412
%U https://proceedings.mlr.press/v83/gajane18a.html
%V 83
%X We study a variant of the stochastic multi-armed bandit (MAB) problem in which the rewards are corrupted. In this framework, motivated by privacy preservation in online recommender systems, the goal is to maximize the sum of the (unobserved) rewards, based on the observation of transformation of these rewards through a stochastic corruption process with known parameters. We provide a lower bound on the expected regret of any bandit algorithm in this corrupted setting. We devise a frequentist algorithm, KLUCB-CF, and a Bayesian algorithm, TS-CF and give upper bounds on their regret. We also provide the appropriate corruption parameters to guarantee a desired level of local privacy and analyze how this impacts the regret. Finally, we present some experimental results that confirm our analysis.
APA
Gajane, P., Urvoy, T. & Kaufmann, E. (2018). Corrupt Bandits for Preserving Local Privacy. Proceedings of Algorithmic Learning Theory, in Proceedings of Machine Learning Research 83:387-412. Available from https://www.proceedings.mlr.press/v83/gajane18a.html.

Related Material