Budgeted Bandit Problems with Continuous Random Costs

Yingce Xia, Wenkui Ding, Xu-Dong Zhang, Nenghai Yu, Tao Qin
Asian Conference on Machine Learning, PMLR 45:317-332, 2016.

Abstract

We study the budgeted bandit problem, where each arm is associated with both a reward and a cost. In a budgeted bandit problem, the objective is to design an arm-pulling algorithm that maximizes the total reward before the budget runs out. In this work, we study both multi-armed bandits and linear bandits, focusing on the setting with continuous random costs. We propose an upper-confidence-bound-based algorithm for multi-armed bandits and a confidence-ball-based algorithm for linear bandits, and prove logarithmic regret bounds for both. Simulations verify the effectiveness of the proposed algorithms.

Cite this Paper


BibTeX
@InProceedings{pmlr-v45-Xia15,
  title     = {Budgeted Bandit Problems with Continuous Random Costs},
  author    = {Xia, Yingce and Ding, Wenkui and Zhang, Xu-Dong and Yu, Nenghai and Qin, Tao},
  booktitle = {Asian Conference on Machine Learning},
  pages     = {317--332},
  year      = {2016},
  editor    = {Holmes, Geoffrey and Liu, Tie-Yan},
  volume    = {45},
  series    = {Proceedings of Machine Learning Research},
  address   = {Hong Kong},
  month     = {20--22 Nov},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v45/Xia15.pdf},
  url       = {https://proceedings.mlr.press/v45/Xia15.html},
  abstract  = {We study the budgeted bandit problem, where each arm is associated with both a reward and a cost. In a budgeted bandit problem, the objective is to design an arm pulling algorithm in order to maximize the total reward before the budget runs out. In this work, we study both multi-armed bandits and linear bandits, and focus on the setting with continuous random costs. We propose an upper confidence bound based algorithm for multi-armed bandits and a confidence ball based algorithm for linear bandits, and prove logarithmic regret bounds for both algorithms. We conduct simulations on the proposed algorithms, which verify the effectiveness of our proposed algorithms.}
}
Endnote
%0 Conference Paper
%T Budgeted Bandit Problems with Continuous Random Costs
%A Yingce Xia
%A Wenkui Ding
%A Xu-Dong Zhang
%A Nenghai Yu
%A Tao Qin
%B Asian Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2016
%E Geoffrey Holmes
%E Tie-Yan Liu
%F pmlr-v45-Xia15
%I PMLR
%P 317--332
%U https://proceedings.mlr.press/v45/Xia15.html
%V 45
%X We study the budgeted bandit problem, where each arm is associated with both a reward and a cost. In a budgeted bandit problem, the objective is to design an arm pulling algorithm in order to maximize the total reward before the budget runs out. In this work, we study both multi-armed bandits and linear bandits, and focus on the setting with continuous random costs. We propose an upper confidence bound based algorithm for multi-armed bandits and a confidence ball based algorithm for linear bandits, and prove logarithmic regret bounds for both algorithms. We conduct simulations on the proposed algorithms, which verify the effectiveness of our proposed algorithms.
RIS
TY  - CPAPER
TI  - Budgeted Bandit Problems with Continuous Random Costs
AU  - Yingce Xia
AU  - Wenkui Ding
AU  - Xu-Dong Zhang
AU  - Nenghai Yu
AU  - Tao Qin
BT  - Asian Conference on Machine Learning
DA  - 2016/02/25
ED  - Geoffrey Holmes
ED  - Tie-Yan Liu
ID  - pmlr-v45-Xia15
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 45
SP  - 317
EP  - 332
L1  - http://proceedings.mlr.press/v45/Xia15.pdf
UR  - https://proceedings.mlr.press/v45/Xia15.html
AB  - We study the budgeted bandit problem, where each arm is associated with both a reward and a cost. In a budgeted bandit problem, the objective is to design an arm pulling algorithm in order to maximize the total reward before the budget runs out. In this work, we study both multi-armed bandits and linear bandits, and focus on the setting with continuous random costs. We propose an upper confidence bound based algorithm for multi-armed bandits and a confidence ball based algorithm for linear bandits, and prove logarithmic regret bounds for both algorithms. We conduct simulations on the proposed algorithms, which verify the effectiveness of our proposed algorithms.
ER  -
APA
Xia, Y., Ding, W., Zhang, X., Yu, N., & Qin, T. (2016). Budgeted Bandit Problems with Continuous Random Costs. Asian Conference on Machine Learning, in Proceedings of Machine Learning Research 45:317-332. Available from https://proceedings.mlr.press/v45/Xia15.html.