# Batched Bandit Problems

[edit]

Vianney Perchet,
Philippe Rigollet,
Sylvain Chassang,
Erik Snowberg
;

Proceedings of The 28th Conference on Learning Theory, PMLR 40:1456-1456, 2015.

#### Abstract

Motivated by practical applications, chiefly clinical trials, we study the regret achievable for stochastic multi-armed bandits under the constraint that the employed policy must split trials into a small number of batches. Our results show that a very small number of batches gives already close to minimax optimal regret bounds and we also evaluate the number of trials in each batch. As a byproduct, we derive optimal policies with low switching cost for stochastic bandits.

#### Related Material