Adaptive Sample Sharing for Multi Agent Linear Bandits

Hamza Cherkaoui, Merwan Barlier, Igor Colin
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:10216-10248, 2025.

Abstract

The multi-agent linear bandit setting is a well-known setting for which designing efficient collaboration between agents remains challenging. This paper studies the impact of data sharing among agents on regret minimization. Unlike most existing approaches, our contribution does not rely on any assumptions on the bandit parameters structure. Our main result formalizes the trade-off between the bias and uncertainty of the bandit parameter estimation for efficient collaboration. This result is the cornerstone of the Bandit Adaptive Sample Sharing (BASS) algorithm, whose efficiency over the current state-of-the-art is validated through both theoretical analysis and empirical evaluations on both synthetic and real-world datasets. Furthermore, we demonstrate that, when agents’ parameters display a cluster structure, our algorithm accurately recovers them.

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-cherkaoui25a,
  title     = {Adaptive Sample Sharing for Multi Agent Linear Bandits},
  author    = {Cherkaoui, Hamza and Barlier, Merwan and Colin, Igor},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages     = {10216--10248},
  year      = {2025},
  editor    = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume    = {267},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--19 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/cherkaoui25a/cherkaoui25a.pdf},
  url       = {https://proceedings.mlr.press/v267/cherkaoui25a.html},
  abstract  = {The multi-agent linear bandit setting is a well-known setting for which designing efficient collaboration between agents remains challenging. This paper studies the impact of data sharing among agents on regret minimization. Unlike most existing approaches, our contribution does not rely on any assumptions on the bandit parameters structure. Our main result formalizes the trade-off between the bias and uncertainty of the bandit parameter estimation for efficient collaboration. This result is the cornerstone of the Bandit Adaptive Sample Sharing (BASS) algorithm, whose efficiency over the current state-of-the-art is validated through both theoretical analysis and empirical evaluations on both synthetic and real-world datasets. Furthermore, we demonstrate that, when agents’ parameters display a cluster structure, our algorithm accurately recovers them.}
}
Endnote
%0 Conference Paper
%T Adaptive Sample Sharing for Multi Agent Linear Bandits
%A Hamza Cherkaoui
%A Merwan Barlier
%A Igor Colin
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-cherkaoui25a
%I PMLR
%P 10216--10248
%U https://proceedings.mlr.press/v267/cherkaoui25a.html
%V 267
%X The multi-agent linear bandit setting is a well-known setting for which designing efficient collaboration between agents remains challenging. This paper studies the impact of data sharing among agents on regret minimization. Unlike most existing approaches, our contribution does not rely on any assumptions on the bandit parameters structure. Our main result formalizes the trade-off between the bias and uncertainty of the bandit parameter estimation for efficient collaboration. This result is the cornerstone of the Bandit Adaptive Sample Sharing (BASS) algorithm, whose efficiency over the current state-of-the-art is validated through both theoretical analysis and empirical evaluations on both synthetic and real-world datasets. Furthermore, we demonstrate that, when agents’ parameters display a cluster structure, our algorithm accurately recovers them.
APA
Cherkaoui, H., Barlier, M., & Colin, I. (2025). Adaptive Sample Sharing for Multi Agent Linear Bandits. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:10216-10248. Available from https://proceedings.mlr.press/v267/cherkaoui25a.html.
