Efficient Online Bandit Multiclass Learning with $\tilde{O}(\sqrt{T})$ Regret

Alina Beygelzimer, Francesco Orabona, Chicheng Zhang
Proceedings of the 34th International Conference on Machine Learning, PMLR 70:488-497, 2017.

Abstract

We present an efficient second-order algorithm with $\tilde{O}(1/\eta \sqrt{T})$ regret for the bandit online multiclass problem. The regret bound holds simultaneously with respect to a family of loss functions parameterized by $\eta$, ranging from hinge loss ($\eta=0$) to squared hinge loss ($\eta=1$). This provides a solution to the open problem of (Abernethy, J. and Rakhlin, A. An efficient bandit algorithm for $\sqrt{T}$-regret in online multiclass prediction? In COLT, 2009). We test our algorithm experimentally, showing that it performs favorably against earlier algorithms.

Cite this Paper


BibTeX
@InProceedings{pmlr-v70-beygelzimer17a, title = {Efficient Online Bandit Multiclass Learning with $\tilde{O}(\sqrt{T})$ Regret}, author = {Alina Beygelzimer and Francesco Orabona and Chicheng Zhang}, booktitle = {Proceedings of the 34th International Conference on Machine Learning}, pages = {488--497}, year = {2017}, editor = {Precup, Doina and Teh, Yee Whye}, volume = {70}, series = {Proceedings of Machine Learning Research}, month = {06--11 Aug}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v70/beygelzimer17a/beygelzimer17a.pdf}, url = {https://proceedings.mlr.press/v70/beygelzimer17a.html}, abstract = {We present an efficient second-order algorithm with $\tilde{O}(1/\eta \sqrt{T})$ regret for the bandit online multiclass problem. The regret bound holds simultaneously with respect to a family of loss functions parameterized by $\eta$, ranging from hinge loss ($\eta=0$) to squared hinge loss ($\eta=1$). This provides a solution to the open problem of (Abernethy, J. and Rakhlin, A. An efficient bandit algorithm for $\sqrt{T}$-regret in online multiclass prediction? In COLT, 2009). We test our algorithm experimentally, showing that it performs favorably against earlier algorithms.} }
Endnote
%0 Conference Paper %T Efficient Online Bandit Multiclass Learning with $\tilde{O}(\sqrt{T})$ Regret %A Alina Beygelzimer %A Francesco Orabona %A Chicheng Zhang %B Proceedings of the 34th International Conference on Machine Learning %C Proceedings of Machine Learning Research %D 2017 %E Doina Precup %E Yee Whye Teh %F pmlr-v70-beygelzimer17a %I PMLR %P 488--497 %U https://proceedings.mlr.press/v70/beygelzimer17a.html %V 70 %X We present an efficient second-order algorithm with $\tilde{O}(1/\eta \sqrt{T})$ regret for the bandit online multiclass problem. The regret bound holds simultaneously with respect to a family of loss functions parameterized by $\eta$, ranging from hinge loss ($\eta=0$) to squared hinge loss ($\eta=1$). This provides a solution to the open problem of (Abernethy, J. and Rakhlin, A. An efficient bandit algorithm for $\sqrt{T}$-regret in online multiclass prediction? In COLT, 2009). We test our algorithm experimentally, showing that it performs favorably against earlier algorithms.
APA
Beygelzimer, A., Orabona, F. & Zhang, C.. (2017). Efficient Online Bandit Multiclass Learning with $\tilde{O}(\sqrt{T})$ Regret. Proceedings of the 34th International Conference on Machine Learning, in Proceedings of Machine Learning Research 70:488-497 Available from https://proceedings.mlr.press/v70/beygelzimer17a.html.

Related Material