QuACK: A Multipurpose Queuing Algorithm for Cooperative $k$-Armed Bandits

Benjamin Howson, Sarah Lucie Filippi, Ciara Pike-Burke
Proceedings of The 28th International Conference on Artificial Intelligence and Statistics, PMLR 258:1927-1935, 2025.

Abstract

This paper studies the cooperative stochastic $k$-armed bandit problem, where $m$ agents collaborate to identify the optimal action. Rather than adapting a specific single-agent algorithm, we propose a general-purpose black-box reduction that extends any single-agent algorithm to the multi-agent setting. Under mild assumptions, we prove that our black-box approach preserves the regret guarantees of the chosen algorithm, and is capable of achieving minimax-optimality up to an additive graph-dependent term. Our method applies to various bandit settings, including heavy-tailed and duelling bandits, and those with local differential privacy. Empirically, it is competitive with or outperforms specialized multi-agent algorithms.

Cite this Paper


BibTeX
@InProceedings{pmlr-v258-howson25a, title = {QuACK: A Multipurpose Queuing Algorithm for Cooperative $k$-Armed Bandits}, author = {Howson, Benjamin and Filippi, Sarah Lucie and Pike-Burke, Ciara}, booktitle = {Proceedings of The 28th International Conference on Artificial Intelligence and Statistics}, pages = {1927--1935}, year = {2025}, editor = {Li, Yingzhen and Mandt, Stephan and Agrawal, Shipra and Khan, Emtiyaz}, volume = {258}, series = {Proceedings of Machine Learning Research}, month = {03--05 May}, publisher = {PMLR}, pdf = {https://raw.githubusercontent.com/mlresearch/v258/main/assets/howson25a/howson25a.pdf}, url = {https://proceedings.mlr.press/v258/howson25a.html}, abstract = {This paper studies the cooperative stochastic $k$-armed bandit problem, where $m$ agents collaborate to identify the optimal action. Rather than adapting a specific single-agent algorithm, we propose a general-purpose black-box reduction that extends any single-agent algorithm to the multi-agent setting. Under mild assumptions, we prove that our black-box approach preserves the regret guarantees of the chosen algorithm, and is capable of achieving minimax-optimality up to an additive graph-dependent term. Our method applies to various bandit settings, including heavy-tailed and duelling bandits, and those with local differential privacy. Empirically, it is competitive with or outperforms specialized multi-agent algorithms.} }
Endnote
%0 Conference Paper %T QuACK: A Multipurpose Queuing Algorithm for Cooperative $k$-Armed Bandits %A Benjamin Howson %A Sarah Lucie Filippi %A Ciara Pike-Burke %B Proceedings of The 28th International Conference on Artificial Intelligence and Statistics %C Proceedings of Machine Learning Research %D 2025 %E Yingzhen Li %E Stephan Mandt %E Shipra Agrawal %E Emtiyaz Khan %F pmlr-v258-howson25a %I PMLR %P 1927--1935 %U https://proceedings.mlr.press/v258/howson25a.html %V 258 %X This paper studies the cooperative stochastic $k$-armed bandit problem, where $m$ agents collaborate to identify the optimal action. Rather than adapting a specific single-agent algorithm, we propose a general-purpose black-box reduction that extends any single-agent algorithm to the multi-agent setting. Under mild assumptions, we prove that our black-box approach preserves the regret guarantees of the chosen algorithm, and is capable of achieving minimax-optimality up to an additive graph-dependent term. Our method applies to various bandit settings, including heavy-tailed and duelling bandits, and those with local differential privacy. Empirically, it is competitive with or outperforms specialized multi-agent algorithms.
APA
Howson, B., Filippi, S.L. & Pike-Burke, C.. (2025). QuACK: A Multipurpose Queuing Algorithm for Cooperative $k$-Armed Bandits. Proceedings of The 28th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 258:1927-1935 Available from https://proceedings.mlr.press/v258/howson25a.html.

Related Material