(Bandit) Convex Optimization with Biased Noisy Gradient Oracles

Xiaowei Hu; Prashanth L.A.; András György; Csaba Szepesvari

(Bandit) Convex Optimization with Biased Noisy Gradient Oracles

Xiaowei Hu, Prashanth L.A., András György, Csaba Szepesvari

Proceedings of the 19th International Conference on Artificial Intelligence and Statistics, PMLR 51:819-828, 2016.

Abstract

A popular class of algorithms for convex optimization and online learning with bandit feedback rely on constructing noisy gradient estimates, which are then used in place of the actual gradients in appropriately adjusted first-order algorithms. Depending on the properties of the function to be optimized and the nature of “noise” in the bandit feedback, the bias and variance of gradient estimates exhibit various tradeoffs. In this paper we propose a novel framework that replaces the specific gradient estimation methods with an abstract oracle model. With the help of the new framework we unify previous works, reproducing their results in a clean and concise fashion, while, perhaps more importantly, the framework also allows us to formally show that to achieve the optimal root-n rate either the algorithms that use existing gradient estimators, or the proofs have to go beyond what exists today.

Cite this Paper

BibTeX


@InProceedings{pmlr-v51-hu16b,
  title = 	 {(Bandit) Convex Optimization with Biased Noisy Gradient Oracles},
  author = 	 {Hu, Xiaowei and L.A., Prashanth and György, András and Szepesvari, Csaba},
  booktitle = 	 {Proceedings of the 19th International Conference on Artificial Intelligence and Statistics},
  pages = 	 {819--828},
  year = 	 {2016},
  editor = 	 {Gretton, Arthur and Robert, Christian C.},
  volume = 	 {51},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Cadiz, Spain},
  month = 	 {09--11 May},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v51/hu16b.pdf},
  url = 	 {https://proceedings.mlr.press/v51/hu16b.html},
  abstract = 	 {A popular class of algorithms for convex optimization and online learning with bandit feedback rely on constructing noisy gradient estimates, which are then used in place of the actual gradients in appropriately adjusted first-order algorithms. Depending on the properties of the function to be optimized and the nature of “noise” in the bandit feedback, the bias and variance of gradient estimates exhibit various tradeoffs. In this paper we propose a novel framework that replaces the specific gradient estimation methods with an abstract oracle model. With the help of the new framework we unify previous works, reproducing their results in a clean and concise fashion, while, perhaps more importantly, the framework also allows us to formally show that to achieve the optimal root-n rate either the algorithms that use existing gradient estimators, or the proofs have to go beyond what exists today.}
}

Endnote

%0 Conference Paper
%T (Bandit) Convex Optimization with Biased Noisy Gradient Oracles
%A Xiaowei Hu
%A Prashanth L.A.
%A András György
%A Csaba Szepesvari
%B Proceedings of the 19th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2016
%E Arthur Gretton
%E Christian C. Robert	
%F pmlr-v51-hu16b
%I PMLR
%P 819--828
%U https://proceedings.mlr.press/v51/hu16b.html
%V 51
%X A popular class of algorithms for convex optimization and online learning with bandit feedback rely on constructing noisy gradient estimates, which are then used in place of the actual gradients in appropriately adjusted first-order algorithms. Depending on the properties of the function to be optimized and the nature of “noise” in the bandit feedback, the bias and variance of gradient estimates exhibit various tradeoffs. In this paper we propose a novel framework that replaces the specific gradient estimation methods with an abstract oracle model. With the help of the new framework we unify previous works, reproducing their results in a clean and concise fashion, while, perhaps more importantly, the framework also allows us to formally show that to achieve the optimal root-n rate either the algorithms that use existing gradient estimators, or the proofs have to go beyond what exists today.

RIS


TY  - CPAPER
TI  - (Bandit) Convex Optimization with Biased Noisy Gradient Oracles
AU  - Xiaowei Hu
AU  - Prashanth L.A.
AU  - András György
AU  - Csaba Szepesvari
BT  - Proceedings of the 19th International Conference on Artificial Intelligence and Statistics
DA  - 2016/05/02
ED  - Arthur Gretton
ED  - Christian C. Robert	
ID  - pmlr-v51-hu16b
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 51
SP  - 819
EP  - 828
L1  - http://proceedings.mlr.press/v51/hu16b.pdf
UR  - https://proceedings.mlr.press/v51/hu16b.html
AB  - A popular class of algorithms for convex optimization and online learning with bandit feedback rely on constructing noisy gradient estimates, which are then used in place of the actual gradients in appropriately adjusted first-order algorithms. Depending on the properties of the function to be optimized and the nature of “noise” in the bandit feedback, the bias and variance of gradient estimates exhibit various tradeoffs. In this paper we propose a novel framework that replaces the specific gradient estimation methods with an abstract oracle model. With the help of the new framework we unify previous works, reproducing their results in a clean and concise fashion, while, perhaps more importantly, the framework also allows us to formally show that to achieve the optimal root-n rate either the algorithms that use existing gradient estimators, or the proofs have to go beyond what exists today.
ER  -

APA


Hu, X., L.A., P., György, A. & Szepesvari, C.. (2016). (Bandit) Convex Optimization with Biased Noisy Gradient Oracles. Proceedings of the 19th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 51:819-828 Available from https://proceedings.mlr.press/v51/hu16b.html.

Related Material

Download PDF