Multi-Player Bandits – a Musical Chairs Approach

Jonathan Rosenski; Ohad Shamir; Liran Szlak

Proceedings of The 33rd International Conference on Machine Learning, PMLR 48:155-163, 2016.

Abstract

We consider a variant of the stochastic multi-armed bandit problem, where multiple players simultaneously choose from the same set of arms and may collide, receiving no reward. This setting has been motivated by problems arising in cognitive radio networks, and is especially challenging under the realistic assumption that communication between players is limited. We provide a communication-free algorithm (Musical Chairs) which attains constant regret with high probability, as well as a sublinear-regret, communication-free algorithm (Dynamic Musical Chairs) for the more difficult setting of players dynamically entering and leaving throughout the game. Moreover, both algorithms do not require prior knowledge of the number of players. To the best of our knowledge, these are the first communication-free algorithms with these types of formal guarantees.
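The collision rule described in the abstract can be illustrated with a minimal sketch of a single round. This is not the Musical Chairs algorithm itself, only the reward model it operates in. The Bernoulli rewards, the function name `play_round`, and the arm means below are assumptions made for illustration and do not come from the paper.

```python
import random
from collections import Counter


def play_round(choices, arm_means, rng):
    """Return one reward per player for a single round.

    choices[i] is the arm index picked by player i. Rewards are assumed
    to be Bernoulli with the given means (an illustrative assumption).
    """
    counts = Counter(choices)
    rewards = []
    for arm in choices:
        if counts[arm] > 1:
            # Two or more players picked this arm: they collide and get nothing.
            rewards.append(0.0)
        else:
            # A lone player on an arm draws a stochastic reward from it.
            rewards.append(1.0 if rng.random() < arm_means[arm] else 0.0)
    return rewards


rng = random.Random(0)
arm_means = [1.0, 0.5, 0.2]   # hypothetical arm means
choices = [0, 0, 2]           # players 0 and 1 collide on arm 0
rewards = play_round(choices, arm_means, rng)
```

In this sketch the two colliding players always receive zero, while the third player receives a random draw from its arm. A communication-free algorithm has to steer players onto distinct good arms using only such observations.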

Cite this Paper

BibTeX


@InProceedings{pmlr-v48-rosenski16,
  title = 	 {Multi-Player Bandits -- a Musical Chairs Approach},
  author = 	 {Rosenski, Jonathan and Shamir, Ohad and Szlak, Liran},
  booktitle = 	 {Proceedings of The 33rd International Conference on Machine Learning},
  pages = 	 {155--163},
  year = 	 {2016},
  editor = 	 {Balcan, Maria Florina and Weinberger, Kilian Q.},
  volume = 	 {48},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {New York, New York, USA},
  month = 	 {20--22 Jun},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v48/rosenski16.pdf},
  url = 	 {https://proceedings.mlr.press/v48/rosenski16.html},
  abstract = 	 {We consider a variant of the stochastic multi-armed bandit problem, where multiple players simultaneously choose from the same set of arms and may collide, receiving no reward. This setting has been motivated by problems arising in cognitive radio networks, and is especially challenging under the realistic assumption that communication between players is limited. We provide a communication-free algorithm (Musical Chairs) which attains constant regret with high probability, as well as a sublinear-regret, communication-free algorithm (Dynamic Musical Chairs) for the more difficult setting of players dynamically entering and leaving throughout the game. Moreover, both algorithms do not require prior knowledge of the number of players. To the best of our knowledge, these are the first communication-free algorithms with these types of formal guarantees.}
}

Endnote

%0 Conference Paper
%T Multi-Player Bandits – a Musical Chairs Approach
%A Jonathan Rosenski
%A Ohad Shamir
%A Liran Szlak
%B Proceedings of The 33rd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2016
%E Maria Florina Balcan
%E Kilian Q. Weinberger
%F pmlr-v48-rosenski16
%I PMLR
%P 155--163
%U https://proceedings.mlr.press/v48/rosenski16.html
%V 48
%X We consider a variant of the stochastic multi-armed bandit problem, where multiple players simultaneously choose from the same set of arms and may collide, receiving no reward. This setting has been motivated by problems arising in cognitive radio networks, and is especially challenging under the realistic assumption that communication between players is limited. We provide a communication-free algorithm (Musical Chairs) which attains constant regret with high probability, as well as a sublinear-regret, communication-free algorithm (Dynamic Musical Chairs) for the more difficult setting of players dynamically entering and leaving throughout the game. Moreover, both algorithms do not require prior knowledge of the number of players. To the best of our knowledge, these are the first communication-free algorithms with these types of formal guarantees.

RIS


TY  - CPAPER
TI  - Multi-Player Bandits – a Musical Chairs Approach
AU  - Jonathan Rosenski
AU  - Ohad Shamir
AU  - Liran Szlak
BT  - Proceedings of The 33rd International Conference on Machine Learning
DA  - 2016/06/11
ED  - Maria Florina Balcan
ED  - Kilian Q. Weinberger
ID  - pmlr-v48-rosenski16
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 48
SP  - 155
EP  - 163
L1  - http://proceedings.mlr.press/v48/rosenski16.pdf
UR  - https://proceedings.mlr.press/v48/rosenski16.html
AB  - We consider a variant of the stochastic multi-armed bandit problem, where multiple players simultaneously choose from the same set of arms and may collide, receiving no reward. This setting has been motivated by problems arising in cognitive radio networks, and is especially challenging under the realistic assumption that communication between players is limited. We provide a communication-free algorithm (Musical Chairs) which attains constant regret with high probability, as well as a sublinear-regret, communication-free algorithm (Dynamic Musical Chairs) for the more difficult setting of players dynamically entering and leaving throughout the game. Moreover, both algorithms do not require prior knowledge of the number of players. To the best of our knowledge, these are the first communication-free algorithms with these types of formal guarantees.
ER  -

APA


Rosenski, J., Shamir, O. & Szlak, L. (2016). Multi-Player Bandits – a Musical Chairs Approach. Proceedings of The 33rd International Conference on Machine Learning, in Proceedings of Machine Learning Research 48:155-163. Available from https://proceedings.mlr.press/v48/rosenski16.html.
