Online Learning with Abstention
[edit]
Proceedings of the 35th International Conference on Machine Learning, PMLR 80:10671075, 2018.
Abstract
We present an extensive study of a key problem in online learning where the learner can opt to abstain from making a prediction, at a certain cost. In the adversarial setting, we show how existing online algorithms and guarantees can be adapted to this problem. In the stochastic setting, we first point out a bias problem that limits the straightforward extension of algorithms such as UCBN to this context. Next, we give a new algorithm, UCBGT, that exploits historical data and timevarying feedback graphs. We show that this algorithm benefits from more favorable regret guarantees than a natural extension of UCBN . We further report the results of a series of experiments demonstrating that UCBGT largely outperforms that extension of UCBN, as well as other standard baselines.
Related Material


