Efficient Kernelized UCB for Contextual Bandits

Houssam Zenati, Alberto Bietti, Eustache Diemert, Julien Mairal, Matthieu Martin, Pierre Gaillard
Proceedings of The 25th International Conference on Artificial Intelligence and Statistics, PMLR 151:5689-5720, 2022.

Abstract

In this paper, we tackle the computational efficiency of kernelized UCB algorithms in contextual bandits. While standard methods require a $\mathcal{O}(CT^3)$ complexity where $T$ is the horizon and the constant $C$ is related to optimizing the UCB rule, we propose an efficient contextual algorithm for large-scale problems. Specifically, our method relies on incremental Nyström approximations of the joint kernel embedding of contexts and actions. This allows us to achieve a complexity of $\mathcal{O}(CTm^2)$ where $m$ is the number of Nyström points. To recover the same regret as the standard kernelized UCB algorithm, $m$ needs to be of order of the effective dimension of the problem, which is at most $\mathcal{O}(\sqrt{T})$ and nearly constant in some cases.

Cite this Paper


BibTeX
@InProceedings{pmlr-v151-zenati22a, title = { Efficient Kernelized UCB for Contextual Bandits }, author = {Zenati, Houssam and Bietti, Alberto and Diemert, Eustache and Mairal, Julien and Martin, Matthieu and Gaillard, Pierre}, booktitle = {Proceedings of The 25th International Conference on Artificial Intelligence and Statistics}, pages = {5689--5720}, year = {2022}, editor = {Camps-Valls, Gustau and Ruiz, Francisco J. R. and Valera, Isabel}, volume = {151}, series = {Proceedings of Machine Learning Research}, month = {28--30 Mar}, publisher = {PMLR}, pdf = {https://proceedings.mlr.press/v151/zenati22a/zenati22a.pdf}, url = {https://proceedings.mlr.press/v151/zenati22a.html}, abstract = { In this paper, we tackle the computational efficiency of kernelized UCB algorithms in contextual bandits. While standard methods require a $\mathcal{O}(CT^3)$ complexity where $T$ is the horizon and the constant $C$ is related to optimizing the UCB rule, we propose an efficient contextual algorithm for large-scale problems. Specifically, our method relies on incremental Nyström approximations of the joint kernel embedding of contexts and actions. This allows us to achieve a complexity of $\mathcal{O}(CTm^2)$ where $m$ is the number of Nyström points. To recover the same regret as the standard kernelized UCB algorithm, $m$ needs to be of order of the effective dimension of the problem, which is at most $\mathcal{O}(\sqrt{T})$ and nearly constant in some cases. } }
Endnote
%0 Conference Paper %T Efficient Kernelized UCB for Contextual Bandits %A Houssam Zenati %A Alberto Bietti %A Eustache Diemert %A Julien Mairal %A Matthieu Martin %A Pierre Gaillard %B Proceedings of The 25th International Conference on Artificial Intelligence and Statistics %C Proceedings of Machine Learning Research %D 2022 %E Gustau Camps-Valls %E Francisco J. R. Ruiz %E Isabel Valera %F pmlr-v151-zenati22a %I PMLR %P 5689--5720 %U https://proceedings.mlr.press/v151/zenati22a.html %V 151 %X In this paper, we tackle the computational efficiency of kernelized UCB algorithms in contextual bandits. While standard methods require a $\mathcal{O}(CT^3)$ complexity where $T$ is the horizon and the constant $C$ is related to optimizing the UCB rule, we propose an efficient contextual algorithm for large-scale problems. Specifically, our method relies on incremental Nyström approximations of the joint kernel embedding of contexts and actions. This allows us to achieve a complexity of $\mathcal{O}(CTm^2)$ where $m$ is the number of Nyström points. To recover the same regret as the standard kernelized UCB algorithm, $m$ needs to be of order of the effective dimension of the problem, which is at most $\mathcal{O}(\sqrt{T})$ and nearly constant in some cases.
APA
Zenati, H., Bietti, A., Diemert, E., Mairal, J., Martin, M. & Gaillard, P.. (2022). Efficient Kernelized UCB for Contextual Bandits . Proceedings of The 25th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 151:5689-5720 Available from https://proceedings.mlr.press/v151/zenati22a.html.

Related Material