Monotone multi-armed bandit allocations
Proceedings of the 24th Annual Conference on Learning Theory, PMLR 19:829-834, 2011.
Abstract
We present a novel angle on multi-armed bandits (henceforth abbreviated MAB) that follows from the recent work on MAB mechanisms (Babaioff et al., 2009; Devanur and Kakade, 2009; Babaioff et al., 2010). The new problem is, essentially, about designing MAB algorithms under an additional constraint motivated by their application to MAB mechanisms. This note is self-contained, although some familiarity with MAB is assumed; we refer the reader to Cesa-Bianchi and Lugosi (2006) for more background.