Better Algorithms for Stochastic Bandits with Adversarial Corruptions

Anupam Gupta, Tomer Koren, Kunal Talwar
Proceedings of the Thirty-Second Conference on Learning Theory, PMLR 99:1562-1578, 2019.

Abstract

We study the stochastic multi-armed bandits problem in the presence of adversarial corruption. We present a new algorithm for this problem whose regret is nearly optimal, substantially improving upon previous work. Our algorithm is agnostic to the level of adversarial contamination and can tolerate a significant amount of corruption with virtually no degradation in performance.
