Budgeted Bandit Problems with Continuous Random Costs

Yingce Xia; Wenkui Ding; Xu-Dong Zhang; Nenghai Yu; Tao Qin

Budgeted Bandit Problems with Continuous Random Costs

Yingce Xia, Wenkui Ding, Xu-Dong Zhang, Nenghai Yu, Tao Qin

Asian Conference on Machine Learning, PMLR 45:317-332, 2016.

Abstract

We study the budgeted bandit problem, where each arm is associated with both a reward and a cost. In a budgeted bandit problem, the objective is to design an arm pulling algorithm in order to maximize the total reward before the budget runs out. In this work, we study both multi-armed bandits and linear bandits, and focus on the setting with continuous random costs. We propose an upper confidence bound based algorithm for multi-armed bandits and a confidence ball based algorithm for linear bandits, and prove logarithmic regret bounds for both algorithms. We conduct simulations on the proposed algorithms, which verify the effectiveness of our proposed algorithms.

Cite this Paper

BibTeX


@InProceedings{pmlr-v45-Xia15,
  title = 	 {Budgeted Bandit Problems with Continuous Random Costs},
  author = 	 {Xia, Yingce and Ding, Wenkui and Zhang, Xu-Dong and Yu, Nenghai and Qin, Tao},
  booktitle = 	 {Asian Conference on Machine Learning},
  pages = 	 {317--332},
  year = 	 {2016},
  editor = 	 {Holmes, Geoffrey and Liu, Tie-Yan},
  volume = 	 {45},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Hong Kong},
  month = 	 {20--22 Nov},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v45/Xia15.pdf},
  url = 	 {https://proceedings.mlr.press/v45/Xia15.html},
  abstract = 	 {We study the budgeted bandit problem, where each arm is associated with both a reward and a cost. In a budgeted bandit problem, the objective is to design an arm pulling algorithm in order to maximize the total reward before the budget runs out. In this work, we study both multi-armed bandits and linear bandits, and focus on the setting with continuous random costs. We propose an upper confidence bound based algorithm for multi-armed bandits and a confidence ball based algorithm for linear bandits, and prove logarithmic regret bounds for both algorithms. We conduct simulations on the proposed algorithms, which verify the effectiveness of our proposed algorithms. }
}

Endnote

%0 Conference Paper
%T Budgeted Bandit Problems with Continuous Random Costs
%A Yingce Xia
%A Wenkui Ding
%A Xu-Dong Zhang
%A Nenghai Yu
%A Tao Qin
%B Asian Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2016
%E Geoffrey Holmes
%E Tie-Yan Liu	
%F pmlr-v45-Xia15
%I PMLR
%P 317--332
%U https://proceedings.mlr.press/v45/Xia15.html
%V 45
%X We study the budgeted bandit problem, where each arm is associated with both a reward and a cost. In a budgeted bandit problem, the objective is to design an arm pulling algorithm in order to maximize the total reward before the budget runs out. In this work, we study both multi-armed bandits and linear bandits, and focus on the setting with continuous random costs. We propose an upper confidence bound based algorithm for multi-armed bandits and a confidence ball based algorithm for linear bandits, and prove logarithmic regret bounds for both algorithms. We conduct simulations on the proposed algorithms, which verify the effectiveness of our proposed algorithms.

RIS


TY  - CPAPER
TI  - Budgeted Bandit Problems with Continuous Random Costs
AU  - Yingce Xia
AU  - Wenkui Ding
AU  - Xu-Dong Zhang
AU  - Nenghai Yu
AU  - Tao Qin
BT  - Asian Conference on Machine Learning
DA  - 2016/02/25
ED  - Geoffrey Holmes
ED  - Tie-Yan Liu	
ID  - pmlr-v45-Xia15
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 45
SP  - 317
EP  - 332
L1  - http://proceedings.mlr.press/v45/Xia15.pdf
UR  - https://proceedings.mlr.press/v45/Xia15.html
AB  - We study the budgeted bandit problem, where each arm is associated with both a reward and a cost. In a budgeted bandit problem, the objective is to design an arm pulling algorithm in order to maximize the total reward before the budget runs out. In this work, we study both multi-armed bandits and linear bandits, and focus on the setting with continuous random costs. We propose an upper confidence bound based algorithm for multi-armed bandits and a confidence ball based algorithm for linear bandits, and prove logarithmic regret bounds for both algorithms. We conduct simulations on the proposed algorithms, which verify the effectiveness of our proposed algorithms. 
ER  -

APA


Xia, Y., Ding, W., Zhang, X., Yu, N. & Qin, T.. (2016). Budgeted Bandit Problems with Continuous Random Costs. Asian Conference on Machine Learning, in Proceedings of Machine Learning Research 45:317-332 Available from https://proceedings.mlr.press/v45/Xia15.html.

Related Material

Download PDF