[edit]

# Sparse Stochastic Bandits

*Proceedings of the 2017 Conference on Learning Theory*, PMLR 65:1269-1270, 2017.

#### Abstract

In the classical multi-armed bandit problem, $d$ arms are available to the decision maker who pulls them sequentially in order to maximize his cumulative reward. Guarantees can be obtained on a relative quantity called regret, which scales linearly with $d$ (or with $\sqrt{d}$ in the minimax sense). We here consider the \emphsparse case of this classical problem in the sense that only a small number of arms, namely $s

#### Cite this Paper

```
``````
```

#### Related Material

```
```

```
```