Distributed Linear Bandits under Communication Constraints

Sudeep Salgia, Qing Zhao
Proceedings of the 40th International Conference on Machine Learning, PMLR 202:29845-29875, 2023.

Abstract

We consider distributed linear bandits where $M$ agents learn collaboratively to minimize the overall cumulative regret incurred by all agents. Information exchange is facilitated by a central server, and both the uplink and downlink communications are carried over channels with fixed capacity, which limits the amount of information that can be transmitted in each use of the channels. We investigate the regret-communication trade-off by (i) establishing information-theoretic lower bounds on the required communications (in terms of bits) for achieving a sublinear regret order; (ii) developing an efficient algorithm that achieves the minimum sublinear regret order offered by centralized learning using the minimum order of communications dictated by the information-theoretic lower bounds. For sparse linear bandits, we show a variant of the proposed algorithm offers better regret-communication trade-off by leveraging the sparsity of the problem.

Cite this Paper


BibTeX
@InProceedings{pmlr-v202-salgia23b, title = {Distributed Linear Bandits under Communication Constraints}, author = {Salgia, Sudeep and Zhao, Qing}, booktitle = {Proceedings of the 40th International Conference on Machine Learning}, pages = {29845--29875}, year = {2023}, editor = {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan}, volume = {202}, series = {Proceedings of Machine Learning Research}, month = {23--29 Jul}, publisher = {PMLR}, pdf = {https://proceedings.mlr.press/v202/salgia23b/salgia23b.pdf}, url = {https://proceedings.mlr.press/v202/salgia23b.html}, abstract = {We consider distributed linear bandits where $M$ agents learn collaboratively to minimize the overall cumulative regret incurred by all agents. Information exchange is facilitated by a central server, and both the uplink and downlink communications are carried over channels with fixed capacity, which limits the amount of information that can be transmitted in each use of the channels. We investigate the regret-communication trade-off by (i) establishing information-theoretic lower bounds on the required communications (in terms of bits) for achieving a sublinear regret order; (ii) developing an efficient algorithm that achieves the minimum sublinear regret order offered by centralized learning using the minimum order of communications dictated by the information-theoretic lower bounds. For sparse linear bandits, we show a variant of the proposed algorithm offers better regret-communication trade-off by leveraging the sparsity of the problem.} }
Endnote
%0 Conference Paper %T Distributed Linear Bandits under Communication Constraints %A Sudeep Salgia %A Qing Zhao %B Proceedings of the 40th International Conference on Machine Learning %C Proceedings of Machine Learning Research %D 2023 %E Andreas Krause %E Emma Brunskill %E Kyunghyun Cho %E Barbara Engelhardt %E Sivan Sabato %E Jonathan Scarlett %F pmlr-v202-salgia23b %I PMLR %P 29845--29875 %U https://proceedings.mlr.press/v202/salgia23b.html %V 202 %X We consider distributed linear bandits where $M$ agents learn collaboratively to minimize the overall cumulative regret incurred by all agents. Information exchange is facilitated by a central server, and both the uplink and downlink communications are carried over channels with fixed capacity, which limits the amount of information that can be transmitted in each use of the channels. We investigate the regret-communication trade-off by (i) establishing information-theoretic lower bounds on the required communications (in terms of bits) for achieving a sublinear regret order; (ii) developing an efficient algorithm that achieves the minimum sublinear regret order offered by centralized learning using the minimum order of communications dictated by the information-theoretic lower bounds. For sparse linear bandits, we show a variant of the proposed algorithm offers better regret-communication trade-off by leveraging the sparsity of the problem.
APA
Salgia, S. & Zhao, Q.. (2023). Distributed Linear Bandits under Communication Constraints. Proceedings of the 40th International Conference on Machine Learning, in Proceedings of Machine Learning Research 202:29845-29875 Available from https://proceedings.mlr.press/v202/salgia23b.html.

Related Material