Robust and private stochastic linear bandits

Vasileios Charisopoulos; Hossein Esfandiari; Vahab Mirrokni

Robust and private stochastic linear bandits

Vasileios Charisopoulos, Hossein Esfandiari, Vahab Mirrokni

Proceedings of the 40th International Conference on Machine Learning, PMLR 202:4096-4115, 2023.

Abstract

In this paper, we study the stochastic linear bandit problem under the additional requirements of differential privacy, robustness and batched observations. In particular, we assume an adversary randomly chooses a constant fraction of the observed rewards in each batch, replacing them with arbitrary numbers. We present differentially private and robust variants of the arm elimination algorithm using logarithmic batch queries under two privacy models and provide regret bounds in both settings. In the first model, every reward in each round is reported by a potentially different client, which reduces to standard local differential privacy (LDP). In the second model, every action is "owned" by a different client, who may aggregate the rewards over multiple queries and privatize the aggregate response instead. To the best of our knowledge, our algorithms are the first simultaneously providing differential privacy and adversarial robustness in the stochastic linear bandits problem.

Cite this Paper

BibTeX

@InProceedings{pmlr-v202-charisopoulos23a,
  title = 	 {Robust and private stochastic linear bandits},
  author =       {Charisopoulos, Vasileios and Esfandiari, Hossein and Mirrokni, Vahab},
  booktitle = 	 {Proceedings of the 40th International Conference on Machine Learning},
  pages = 	 {4096--4115},
  year = 	 {2023},
  editor = 	 {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan},
  volume = 	 {202},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {23--29 Jul},
  publisher =    {PMLR},
  pdf = 	 {https://proceedings.mlr.press/v202/charisopoulos23a/charisopoulos23a.pdf},
  url = 	 {https://proceedings.mlr.press/v202/charisopoulos23a.html},
  abstract = 	 {In this paper, we study the stochastic linear bandit problem under the additional requirements of differential privacy, robustness and batched observations. In particular, we assume an adversary randomly chooses a constant fraction of the observed rewards in each batch, replacing them with arbitrary numbers. We present differentially private and robust variants of the arm elimination algorithm using logarithmic batch queries under two privacy models and provide regret bounds in both settings. In the first model, every reward in each round is reported by a potentially different client, which reduces to standard local differential privacy (LDP). In the second model, every action is "owned" by a different client, who may aggregate the rewards over multiple queries and privatize the aggregate response instead. To the best of our knowledge, our algorithms are the first simultaneously providing differential privacy and adversarial robustness in the stochastic linear bandits problem.}
}

Endnote

%0 Conference Paper
%T Robust and private stochastic linear bandits
%A Vasileios Charisopoulos
%A Hossein Esfandiari
%A Vahab Mirrokni
%B Proceedings of the 40th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Andreas Krause
%E Emma Brunskill
%E Kyunghyun Cho
%E Barbara Engelhardt
%E Sivan Sabato
%E Jonathan Scarlett	
%F pmlr-v202-charisopoulos23a
%I PMLR
%P 4096--4115
%U https://proceedings.mlr.press/v202/charisopoulos23a.html
%V 202
%X In this paper, we study the stochastic linear bandit problem under the additional requirements of differential privacy, robustness and batched observations. In particular, we assume an adversary randomly chooses a constant fraction of the observed rewards in each batch, replacing them with arbitrary numbers. We present differentially private and robust variants of the arm elimination algorithm using logarithmic batch queries under two privacy models and provide regret bounds in both settings. In the first model, every reward in each round is reported by a potentially different client, which reduces to standard local differential privacy (LDP). In the second model, every action is "owned" by a different client, who may aggregate the rewards over multiple queries and privatize the aggregate response instead. To the best of our knowledge, our algorithms are the first simultaneously providing differential privacy and adversarial robustness in the stochastic linear bandits problem.

APA

Charisopoulos, V., Esfandiari, H. & Mirrokni, V.. (2023). Robust and private stochastic linear bandits. Proceedings of the 40th International Conference on Machine Learning, in Proceedings of Machine Learning Research 202:4096-4115 Available from https://proceedings.mlr.press/v202/charisopoulos23a.html.

Robust and private stochastic linear bandits

Abstract

Cite this Paper

Related Material