Efficient Online Bandit Multiclass Learning with $\tilde{O}(\sqrt{T})$ Regret

Alina Beygelzimer; Francesco Orabona; Chicheng Zhang

Efficient Online Bandit Multiclass Learning with $\tilde{O}(\sqrt{T})$ Regret

Alina Beygelzimer, Francesco Orabona, Chicheng Zhang

Proceedings of the 34th International Conference on Machine Learning, PMLR 70:488-497, 2017.

Abstract

We present an efficient second-order algorithm with $\tilde{O}(1/\eta \sqrt{T})$ regret for the bandit online multiclass problem. The regret bound holds simultaneously with respect to a family of loss functions parameterized by $\eta$, ranging from hinge loss ($\eta=0$) to squared hinge loss ($\eta=1$). This provides a solution to the open problem of (Abernethy, J. and Rakhlin, A. An efficient bandit algorithm for $\sqrt{T}$-regret in online multiclass prediction? In COLT, 2009). We test our algorithm experimentally, showing that it performs favorably against earlier algorithms.

Cite this Paper

BibTeX


@InProceedings{pmlr-v70-beygelzimer17a,
  title = 	 {Efficient Online Bandit Multiclass Learning with $\tilde{O}(\sqrt{T})$ Regret},
  author =       {Alina Beygelzimer and Francesco Orabona and Chicheng Zhang},
  booktitle = 	 {Proceedings of the 34th International Conference on Machine Learning},
  pages = 	 {488--497},
  year = 	 {2017},
  editor = 	 {Precup, Doina and Teh, Yee Whye},
  volume = 	 {70},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {06--11 Aug},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v70/beygelzimer17a/beygelzimer17a.pdf},
  url = 	 {https://proceedings.mlr.press/v70/beygelzimer17a.html},
  abstract = 	 {We present an efficient second-order algorithm with $\tilde{O}(1/\eta \sqrt{T})$ regret for the bandit online multiclass problem. The regret bound holds simultaneously with respect to a family of loss functions parameterized by $\eta$, ranging from hinge loss ($\eta=0$) to squared hinge loss ($\eta=1$). This provides a solution to the open problem of (Abernethy, J. and Rakhlin, A. An efficient bandit algorithm for $\sqrt{T}$-regret in online multiclass prediction? In COLT, 2009). We test our algorithm experimentally, showing that it performs favorably against earlier algorithms.}
}

Endnote

%0 Conference Paper
%T Efficient Online Bandit Multiclass Learning with $\tilde{O}(\sqrt{T})$ Regret
%A Alina Beygelzimer
%A Francesco Orabona
%A Chicheng Zhang
%B Proceedings of the 34th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2017
%E Doina Precup
%E Yee Whye Teh	
%F pmlr-v70-beygelzimer17a
%I PMLR
%P 488--497
%U https://proceedings.mlr.press/v70/beygelzimer17a.html
%V 70
%X We present an efficient second-order algorithm with $\tilde{O}(1/\eta \sqrt{T})$ regret for the bandit online multiclass problem. The regret bound holds simultaneously with respect to a family of loss functions parameterized by $\eta$, ranging from hinge loss ($\eta=0$) to squared hinge loss ($\eta=1$). This provides a solution to the open problem of (Abernethy, J. and Rakhlin, A. An efficient bandit algorithm for $\sqrt{T}$-regret in online multiclass prediction? In COLT, 2009). We test our algorithm experimentally, showing that it performs favorably against earlier algorithms.

APA


Beygelzimer, A., Orabona, F. & Zhang, C.. (2017). Efficient Online Bandit Multiclass Learning with $\tilde{O}(\sqrt{T})$ Regret. Proceedings of the 34th International Conference on Machine Learning, in Proceedings of Machine Learning Research 70:488-497 Available from https://proceedings.mlr.press/v70/beygelzimer17a.html.

Efficient Online Bandit Multiclass Learning with $\tilde{O}(\sqrt{T})$ Regret

Abstract

Cite this Paper

Related Material