Online Newton Method for Bandit Convex Optimisation Extended Abstract

Hidde Fokkema; Dirk Van der Hoeven; Tor Lattimore; Jack J. Mayo

Proceedings of Thirty Seventh Conference on Learning Theory, PMLR 247:1713-1714, 2024.

Abstract

We introduce a computationally efficient algorithm for zeroth-order bandit convex optimisation and prove that in the adversarial setting its regret is at most $d^{3.5} \sqrt{n} \mathrm{polylog}(n, d)$ with high probability where $d$ is the dimension and $n$ is the time horizon. In the stochastic setting the bound improves to $M d^{2} \sqrt{n} \mathrm{polylog}(n, d)$ where $M \in [d^{-1/2}, d^{-1/4}]$ is a constant that depends on the geometry of the constraint set and the desired computational properties.

Cite this Paper

BibTeX


@InProceedings{pmlr-v247-fokkema24a,
  title = 	 {Online Newton Method for Bandit Convex Optimisation Extended Abstract},
  author =       {Fokkema, Hidde and Van der Hoeven, Dirk and Lattimore, Tor and Mayo, Jack J.},
  booktitle = 	 {Proceedings of Thirty Seventh Conference on Learning Theory},
  pages = 	 {1713--1714},
  year = 	 {2024},
  editor = 	 {Agrawal, Shipra and Roth, Aaron},
  volume = 	 {247},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {30 Jun--03 Jul},
  publisher =    {PMLR},
  pdf = 	 {https://proceedings.mlr.press/v247/fokkema24a/fokkema24a.pdf},
  url = 	 {https://proceedings.mlr.press/v247/fokkema24a.html},
  abstract = 	 {We introduce a computationally efficient algorithm for zeroth-order bandit convex optimisation and prove that in the adversarial setting its regret is at most $d^{3.5} \sqrt{n} \mathrm{polylog}(n, d)$ with high probability where $d$ is the dimension and $n$ is the time horizon.  In the stochastic setting the bound improves to $M d^{2} \sqrt{n} \mathrm{polylog}(n, d)$ where $M \in [d^{-1/2}, d^{-1/4}]$ is a constant that depends on the geometry of the constraint set and the desired computational properties.}
}

Endnote

%0 Conference Paper
%T Online Newton Method for Bandit Convex Optimisation Extended Abstract
%A Hidde Fokkema
%A Dirk Van der Hoeven
%A Tor Lattimore
%A Jack J. Mayo
%B Proceedings of Thirty Seventh Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2024
%E Shipra Agrawal
%E Aaron Roth
%F pmlr-v247-fokkema24a
%I PMLR
%P 1713--1714
%U https://proceedings.mlr.press/v247/fokkema24a.html
%V 247
%X We introduce a computationally efficient algorithm for zeroth-order bandit convex optimisation and prove that in the adversarial setting its regret is at most $d^{3.5} \sqrt{n} \mathrm{polylog}(n, d)$ with high probability where $d$ is the dimension and $n$ is the time horizon.  In the stochastic setting the bound improves to $M d^{2} \sqrt{n} \mathrm{polylog}(n, d)$ where $M \in [d^{-1/2}, d^{-1/4}]$ is a constant that depends on the geometry of the constraint set and the desired computational properties.

APA


Fokkema, H., Van der Hoeven, D., Lattimore, T. & Mayo, J. J. (2024). Online Newton Method for Bandit Convex Optimisation Extended Abstract. Proceedings of Thirty Seventh Conference on Learning Theory, in Proceedings of Machine Learning Research 247:1713-1714. Available from https://proceedings.mlr.press/v247/fokkema24a.html.
