Collaborative Exploration in Stochastic Multi-Player Bandits

Hiba Dakdouk, Raphaël Féraud, Nadège Varsier, Patrick Maillé
Proceedings of The 12th Asian Conference on Machine Learning, PMLR 129:193-208, 2020.

Abstract

The Internet of Things (IoT) faces multiple challenges in achieving high reliability, low latency, and low power consumption. Its performance is affected by many factors, such as external interference from coexisting wireless communication technologies that share the same spectrum. To address this problem, we introduce a general approach for identifying channels with poor link quality. We formulate the problem as a multi-player multi-armed bandit problem, where the devices in an IoT network are the players and the radio channels are the arms. For a realistic formulation, we do not assume that sensing information is available or that the number of players is below the number of arms. We develop and analyze a collaborative decentralized algorithm that aims to find a set of $m$ $(\epsilon,m)$-optimal arms, using an Explore-$m$ algorithm (as denoted by Kalyanakrishnan and Stone (2010)) as a subroutine, and hence blacklists the suboptimal arms in order to improve the QoS of IoT networks while reducing their energy consumption. We show analytically and experimentally that our algorithm outperforms selfish algorithms in terms of sample complexity with a low communication cost, and that although playing a smaller set of arms increases the collision rate, playing only the optimal arms improves the QoS of the network.
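To make the notion of $(\epsilon,m)$-optimal arms concrete, the following is a minimal single-player sketch and not the collaborative algorithm of the paper. It assumes the usual definition in which an arm is $(\epsilon,m)$-optimal when its mean is within $\epsilon$ of the $m$-th largest mean, and it uses a naive uniform-sampling baseline with a Hoeffding-style sample size. The function names, the sample-size constant, and the Bernoulli channel model are illustrative assumptions.

```python
import math
import random


def eps_m_optimal_arms(means, m, eps):
    """Indices of arms whose true mean is within eps of the m-th largest mean."""
    threshold = sorted(means, reverse=True)[m - 1] - eps
    return {i for i, mu in enumerate(means) if mu >= threshold}


def uniform_explore_m(pull, n_arms, m, eps, delta):
    """Naive baseline (assumption, not the paper's method): sample every arm
    equally often, then keep the m arms with the highest empirical means."""
    # Hoeffding-style sample size per arm; the constant is an illustrative choice.
    pulls_per_arm = math.ceil((2.0 / eps ** 2) * math.log(n_arms / delta))
    estimates = []
    for arm in range(n_arms):
        total = sum(pull(arm) for _ in range(pulls_per_arm))
        estimates.append(total / pulls_per_arm)
    ranked = sorted(range(n_arms), key=lambda a: estimates[a], reverse=True)
    return set(ranked[:m])


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical channel qualities (probability of a successful transmission).
    true_means = [0.9, 0.8, 0.75, 0.3, 0.2]

    def pull(arm):
        # Bernoulli reward: 1.0 on success, 0.0 on failure.
        return 1.0 if rng.random() < true_means[arm] else 0.0

    kept = uniform_explore_m(pull, len(true_means), m=2, eps=0.1, delta=0.05)
    blacklisted = set(range(len(true_means))) - kept
    print("kept:", kept, "blacklisted:", blacklisted)
    print("all kept arms acceptable:",
          kept <= eps_m_optimal_arms(true_means, m=2, eps=0.1))
```

In this sketch the arms outside the returned set play the role of the blacklisted channels. The paper's contribution concerns how several players share this exploration cost, which the baseline above does not attempt to model.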

Cite this Paper


BibTeX
@InProceedings{pmlr-v129-dakdouk20a,
  title = {Collaborative Exploration in Stochastic Multi-Player Bandits},
  author = {Dakdouk, Hiba and F{\'e}raud, Rapha{\"e}l and Varsier, Nad{\`e}ge and Maill{\'e}, Patrick},
  booktitle = {Proceedings of The 12th Asian Conference on Machine Learning},
  pages = {193--208},
  year = {2020},
  editor = {Pan, Sinno Jialin and Sugiyama, Masashi},
  volume = {129},
  series = {Proceedings of Machine Learning Research},
  month = {18--20 Nov},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v129/dakdouk20a/dakdouk20a.pdf},
  url = {https://proceedings.mlr.press/v129/dakdouk20a.html},
  abstract = {Internet of Things (IoT) faces multiple challenges to achieve high reliability, low-latency and low power consumption. Its performance is affected by many factors such as external interference coming from other coexisting wireless communication technologies that are sharing the same spectrum. To address this problem, we introduce a general approach for the identification of poor-link quality channels. We formulate our problem as a multi-player multi-armed bandit problem, where the devices in an IoT network are the players, and the arms are the radio channels. For a realistic formulation, we do not assume that sensing information is available or that the number of players is below the number of arms. We develop and analyze a collaborative decentralized algorithm that aims to find a set of $m$ $(\epsilon,m)$-optimal arms using an Explore-$m$ algorithm (as denoted by Kalyanakrishnan and Stone (2010)) as a subroutine, and hence blacklisting the suboptimal arms in order to improve the QoS of IoT networks while reducing their energy consumption. We prove analytically and experimentally that our algorithm outperforms selfish algorithms in terms of sample complexity with a low communication cost, and that although playing a smaller set of arms increases the collision rate, playing the optimal arms only improves the QoS of the network.}
}
Endnote
%0 Conference Paper
%T Collaborative Exploration in Stochastic Multi-Player Bandits
%A Hiba Dakdouk
%A Raphaël Féraud
%A Nadège Varsier
%A Patrick Maillé
%B Proceedings of The 12th Asian Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2020
%E Sinno Jialin Pan
%E Masashi Sugiyama
%F pmlr-v129-dakdouk20a
%I PMLR
%P 193--208
%U https://proceedings.mlr.press/v129/dakdouk20a.html
%V 129
%X Internet of Things (IoT) faces multiple challenges to achieve high reliability, low-latency and low power consumption. Its performance is affected by many factors such as external interference coming from other coexisting wireless communication technologies that are sharing the same spectrum. To address this problem, we introduce a general approach for the identification of poor-link quality channels. We formulate our problem as a multi-player multi-armed bandit problem, where the devices in an IoT network are the players, and the arms are the radio channels. For a realistic formulation, we do not assume that sensing information is available or that the number of players is below the number of arms. We develop and analyze a collaborative decentralized algorithm that aims to find a set of $m$ $(\epsilon,m)$-optimal arms using an Explore-$m$ algorithm (as denoted by Kalyanakrishnan and Stone (2010)) as a subroutine, and hence blacklisting the suboptimal arms in order to improve the QoS of IoT networks while reducing their energy consumption. We prove analytically and experimentally that our algorithm outperforms selfish algorithms in terms of sample complexity with a low communication cost, and that although playing a smaller set of arms increases the collision rate, playing the optimal arms only improves the QoS of the network.
APA
Dakdouk, H., Féraud, R., Varsier, N., & Maillé, P. (2020). Collaborative Exploration in Stochastic Multi-Player Bandits. Proceedings of The 12th Asian Conference on Machine Learning, in Proceedings of Machine Learning Research 129:193-208. Available from https://proceedings.mlr.press/v129/dakdouk20a.html.
