Feasible Arm Identification
[edit]
Proceedings of the 35th International Conference on Machine Learning, PMLR 80:25352543, 2018.
Abstract
We introduce the feasible arm identification problem, a pure exploration multiarmed bandit problem where the agent is given a set of $D$dimensional arms and a polyhedron $P = \{x : A x \leq b \} \subset R^D$. Pulling an arm gives a random vector and the goal is to determine, using a fixed budget of $T$ pulls, which of the arms have means belonging to $P$. We propose three algorithms MDUCBE, MDSAR, and MDAPT and provide a unified analysis establishing upper bounds for each of them. We also establish a lower bound that matches up to constants the upper bounds of MDUCBE and MDAPT. Finally, we demonstrate the effectiveness of our algorithms on synthetic and realworld datasets.
Related Material


