Gossip-based distributed stochastic bandit algorithms

Balazs Szorenyi; Robert Busa-Fekete; Istvan Hegedus; Robert Ormandi; Mark Jelasity; Balazs Kegl

Gossip-based distributed stochastic bandit algorithms

Balazs Szorenyi, Robert Busa-Fekete, Istvan Hegedus, Robert Ormandi, Mark Jelasity, Balazs Kegl

Proceedings of the 30th International Conference on Machine Learning, PMLR 28(3):19-27, 2013.

Abstract

The multi-armed bandit problem has attracted remarkable attention in the machine learning community and many efficient algorithms have been proposed to handle the so-called exploitation-exploration dilemma in various bandit setups. At the same time, significantly less effort has been devoted to adapting bandit algorithms to particular architectures, such as sensor networks, multi-core machines, or peer-to-peer (P2P) environments, which could potentially speed up their convergence. Our goal is to adapt stochastic bandit algorithms to P2P networks. In our setup, the same set of arms is available in each peer. In every iteration each peer can pull one arm independently of the other peers, and then some limited communication is possible with a few random other peers. As our main result, we show that our adaptation achieves a linear speedup in terms of the number of peers participating in the network. More precisely, we show that the probability of playing a suboptimal arm at a peer in iteration t = Ω( \log N ) is proportional to 1/(Nt) where N denotes the number of peers. The theoretical results are supported by simulation experiments showing that our algorithm scales gracefully with the size of network.

Cite this Paper

BibTeX


@InProceedings{pmlr-v28-szorenyi13,
  title = 	 {Gossip-based distributed stochastic bandit algorithms},
  author = 	 {Szorenyi, Balazs and Busa-Fekete, Robert and Hegedus, Istvan and Ormandi, Robert and Jelasity, Mark and Kegl, Balazs},
  booktitle = 	 {Proceedings of the 30th International Conference on Machine Learning},
  pages = 	 {19--27},
  year = 	 {2013},
  editor = 	 {Dasgupta, Sanjoy and McAllester, David},
  volume = 	 {28},
  number =       {3},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Atlanta, Georgia, USA},
  month = 	 {17--19 Jun},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v28/szorenyi13.pdf},
  url = 	 {https://proceedings.mlr.press/v28/szorenyi13.html},
  abstract = 	 {The multi-armed bandit problem has attracted remarkable attention in the machine learning community and many efficient algorithms have been proposed to handle the so-called exploitation-exploration dilemma in various bandit setups. At the same time, significantly less effort has been devoted to adapting bandit algorithms to particular architectures, such as sensor networks, multi-core machines, or peer-to-peer (P2P) environments, which could potentially speed up their convergence. Our goal is to adapt stochastic bandit algorithms to P2P networks.  In our setup, the same set of arms is available in each peer. In every iteration each peer can pull one arm independently of the other peers, and then some limited communication is possible with a few random other peers.  As our main result, we show that our adaptation achieves a linear speedup in terms of the number of peers participating in the network.  More precisely, we show that the probability of playing a suboptimal arm at a peer in iteration t = Ω( \log N ) is proportional to 1/(Nt) where N denotes the number of peers.  The theoretical results are supported by simulation experiments showing that our algorithm scales gracefully with the size of network.  }
}

Endnote

%0 Conference Paper
%T Gossip-based distributed stochastic bandit algorithms
%A Balazs Szorenyi
%A Robert Busa-Fekete
%A Istvan Hegedus
%A Robert Ormandi
%A Mark Jelasity
%A Balazs Kegl
%B Proceedings of the 30th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2013
%E Sanjoy Dasgupta
%E David McAllester	
%F pmlr-v28-szorenyi13
%I PMLR
%P 19--27
%U https://proceedings.mlr.press/v28/szorenyi13.html
%V 28
%N 3
%X The multi-armed bandit problem has attracted remarkable attention in the machine learning community and many efficient algorithms have been proposed to handle the so-called exploitation-exploration dilemma in various bandit setups. At the same time, significantly less effort has been devoted to adapting bandit algorithms to particular architectures, such as sensor networks, multi-core machines, or peer-to-peer (P2P) environments, which could potentially speed up their convergence. Our goal is to adapt stochastic bandit algorithms to P2P networks.  In our setup, the same set of arms is available in each peer. In every iteration each peer can pull one arm independently of the other peers, and then some limited communication is possible with a few random other peers.  As our main result, we show that our adaptation achieves a linear speedup in terms of the number of peers participating in the network.  More precisely, we show that the probability of playing a suboptimal arm at a peer in iteration t = Ω( \log N ) is proportional to 1/(Nt) where N denotes the number of peers.  The theoretical results are supported by simulation experiments showing that our algorithm scales gracefully with the size of network.

RIS


TY  - CPAPER
TI  - Gossip-based distributed stochastic bandit algorithms
AU  - Balazs Szorenyi
AU  - Robert Busa-Fekete
AU  - Istvan Hegedus
AU  - Robert Ormandi
AU  - Mark Jelasity
AU  - Balazs Kegl
BT  - Proceedings of the 30th International Conference on Machine Learning
DA  - 2013/05/26
ED  - Sanjoy Dasgupta
ED  - David McAllester	
ID  - pmlr-v28-szorenyi13
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 28
IS  - 3
SP  - 19
EP  - 27
L1  - http://proceedings.mlr.press/v28/szorenyi13.pdf
UR  - https://proceedings.mlr.press/v28/szorenyi13.html
AB  - The multi-armed bandit problem has attracted remarkable attention in the machine learning community and many efficient algorithms have been proposed to handle the so-called exploitation-exploration dilemma in various bandit setups. At the same time, significantly less effort has been devoted to adapting bandit algorithms to particular architectures, such as sensor networks, multi-core machines, or peer-to-peer (P2P) environments, which could potentially speed up their convergence. Our goal is to adapt stochastic bandit algorithms to P2P networks.  In our setup, the same set of arms is available in each peer. In every iteration each peer can pull one arm independently of the other peers, and then some limited communication is possible with a few random other peers.  As our main result, we show that our adaptation achieves a linear speedup in terms of the number of peers participating in the network.  More precisely, we show that the probability of playing a suboptimal arm at a peer in iteration t = Ω( \log N ) is proportional to 1/(Nt) where N denotes the number of peers.  The theoretical results are supported by simulation experiments showing that our algorithm scales gracefully with the size of network.  
ER  -

APA


Szorenyi, B., Busa-Fekete, R., Hegedus, I., Ormandi, R., Jelasity, M. & Kegl, B.. (2013). Gossip-based distributed stochastic bandit algorithms. Proceedings of the 30th International Conference on Machine Learning, in Proceedings of Machine Learning Research 28(3):19-27 Available from https://proceedings.mlr.press/v28/szorenyi13.html.

Gossip-based distributed stochastic bandit algorithms

Abstract

Cite this Paper

Related Material