Correlated bandits or: How to minimize mean-squared error online

Vinay Praneeth Boda, Prashanth L.A.
Proceedings of the 36th International Conference on Machine Learning, PMLR 97:686-694, 2019.

Abstract

While the objective in traditional multi-armed bandit problems is to find the arm with the highest mean, in many settings, finding an arm that best captures information about other arms is of interest. This objective, however, requires learning the underlying correlation structure and not just the means. Sensors placement for industrial surveillance and cellular network monitoring are a few applications, where the underlying correlation structure plays an important role. Motivated by such applications, we formulate the correlated bandit problem, where the objective is to find the arm with the lowest mean-squared error (MSE) in estimating all the arms. To this end, we derive first an MSE estimator based on sample variances/covariances and show that our estimator exponentially concentrates around the true MSE. Under a best-arm identification framework, we propose a successive rejects type algorithm and provide bounds on the probability of error in identifying the best arm. Using minimax theory, we also derive fundamental performance limits for the correlated bandit problem.

Cite this Paper


BibTeX
@InProceedings{pmlr-v97-boda19a, title = {Correlated bandits or: How to minimize mean-squared error online}, author = {Boda, Vinay Praneeth and L.A., Prashanth}, booktitle = {Proceedings of the 36th International Conference on Machine Learning}, pages = {686--694}, year = {2019}, editor = {Chaudhuri, Kamalika and Salakhutdinov, Ruslan}, volume = {97}, series = {Proceedings of Machine Learning Research}, month = {09--15 Jun}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v97/boda19a/boda19a.pdf}, url = {https://proceedings.mlr.press/v97/boda19a.html}, abstract = {While the objective in traditional multi-armed bandit problems is to find the arm with the highest mean, in many settings, finding an arm that best captures information about other arms is of interest. This objective, however, requires learning the underlying correlation structure and not just the means. Sensors placement for industrial surveillance and cellular network monitoring are a few applications, where the underlying correlation structure plays an important role. Motivated by such applications, we formulate the correlated bandit problem, where the objective is to find the arm with the lowest mean-squared error (MSE) in estimating all the arms. To this end, we derive first an MSE estimator based on sample variances/covariances and show that our estimator exponentially concentrates around the true MSE. Under a best-arm identification framework, we propose a successive rejects type algorithm and provide bounds on the probability of error in identifying the best arm. Using minimax theory, we also derive fundamental performance limits for the correlated bandit problem.} }
Endnote
%0 Conference Paper %T Correlated bandits or: How to minimize mean-squared error online %A Vinay Praneeth Boda %A Prashanth L.A. %B Proceedings of the 36th International Conference on Machine Learning %C Proceedings of Machine Learning Research %D 2019 %E Kamalika Chaudhuri %E Ruslan Salakhutdinov %F pmlr-v97-boda19a %I PMLR %P 686--694 %U https://proceedings.mlr.press/v97/boda19a.html %V 97 %X While the objective in traditional multi-armed bandit problems is to find the arm with the highest mean, in many settings, finding an arm that best captures information about other arms is of interest. This objective, however, requires learning the underlying correlation structure and not just the means. Sensors placement for industrial surveillance and cellular network monitoring are a few applications, where the underlying correlation structure plays an important role. Motivated by such applications, we formulate the correlated bandit problem, where the objective is to find the arm with the lowest mean-squared error (MSE) in estimating all the arms. To this end, we derive first an MSE estimator based on sample variances/covariances and show that our estimator exponentially concentrates around the true MSE. Under a best-arm identification framework, we propose a successive rejects type algorithm and provide bounds on the probability of error in identifying the best arm. Using minimax theory, we also derive fundamental performance limits for the correlated bandit problem.
APA
Boda, V.P. & L.A., P.. (2019). Correlated bandits or: How to minimize mean-squared error online. Proceedings of the 36th International Conference on Machine Learning, in Proceedings of Machine Learning Research 97:686-694 Available from https://proceedings.mlr.press/v97/boda19a.html.

Related Material