A Reduction from Linear Contextual Bandits Lower Bounds to Estimations Lower Bounds

Jiahao He, Jiheng Zhang, Rachel Q. Zhang
Proceedings of the 39th International Conference on Machine Learning, PMLR 162:8660-8677, 2022.

Abstract

Linear contextual bandits and their variants are usually solved using algorithms guided by parameter estimation. The Cauchy–Schwarz inequality establishes that estimation errors dominate algorithm regrets, so accurate estimators suffice to guarantee low-regret algorithms. In this paper, we complete the reverse direction by establishing the necessity. In particular, we provide a generic transformation from algorithms for linear contextual bandits to estimators for linear models, and show that algorithm regrets dominate the estimation errors of their induced estimators, i.e., low-regret algorithms imply accurate estimators. Moreover, our analysis reduces the regret lower bound to an estimation error lower bound, bridging the lower-bound analyses of linear contextual bandit problems and linear regression.
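The forward direction the abstract refers to is the standard elliptical-potential argument from the linear bandit literature. As a hedged sketch (the notation below is illustrative, not taken from the paper): with true parameter $\theta^\ast$, round-$t$ estimate $\hat\theta_t$, chosen action $x_t$, and regularized design matrix $V_t$, Cauchy–Schwarz bounds the per-round regret by the estimation error:

```latex
% Illustrative sketch of the standard upper-bound argument
% (notation assumed for exposition, not the paper's own):
\[
  \langle \theta^\ast - \hat\theta_t,\; x_t \rangle
  \;\le\; \|\theta^\ast - \hat\theta_t\|_{V_t}\, \|x_t\|_{V_t^{-1}},
  \qquad
  V_t \;=\; \lambda I + \sum_{s < t} x_s x_s^\top .
\]
```

A bound on the estimation error $\|\theta^\ast - \hat\theta_t\|_{V_t}$ therefore yields a regret bound; the paper's contribution is the converse reduction, showing that a regret bound forces an estimation-error bound.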

Cite this Paper


BibTeX
@InProceedings{pmlr-v162-he22e,
  title     = {A Reduction from Linear Contextual Bandit Lower Bounds to Estimation Lower Bounds},
  author    = {He, Jiahao and Zhang, Jiheng and Zhang, Rachel Q.},
  booktitle = {Proceedings of the 39th International Conference on Machine Learning},
  pages     = {8660--8677},
  year      = {2022},
  editor    = {Chaudhuri, Kamalika and Jegelka, Stefanie and Song, Le and Szepesvari, Csaba and Niu, Gang and Sabato, Sivan},
  volume    = {162},
  series    = {Proceedings of Machine Learning Research},
  month     = {17--23 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v162/he22e/he22e.pdf},
  url       = {https://proceedings.mlr.press/v162/he22e.html},
  abstract  = {Linear contextual bandits and their variants are usually solved using algorithms guided by parameter estimation. Cauchy-Schwartz inequality established that estimation errors dominate algorithm regrets, and thus, accurate estimators suffice to guarantee algorithms with low regrets. In this paper, we complete the reverse direction by establishing the necessity. In particular, we provide a generic transformation from algorithms for linear contextual bandits to estimators for linear models, and show that algorithm regrets dominate estimation errors of their induced estimators, i.e., low-regret algorithms must imply accurate estimators. Moreover, our analysis reduces the regret lower bound to an estimation error, bridging the lower bound analysis in linear contextual bandit problems and linear regression.}
}
Endnote
%0 Conference Paper
%T A Reduction from Linear Contextual Bandits Lower Bounds to Estimations Lower Bounds
%A Jiahao He
%A Jiheng Zhang
%A Rachel Q. Zhang
%B Proceedings of the 39th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2022
%E Kamalika Chaudhuri
%E Stefanie Jegelka
%E Le Song
%E Csaba Szepesvari
%E Gang Niu
%E Sivan Sabato
%F pmlr-v162-he22e
%I PMLR
%P 8660--8677
%U https://proceedings.mlr.press/v162/he22e.html
%V 162
%X Linear contextual bandits and their variants are usually solved using algorithms guided by parameter estimation. Cauchy-Schwartz inequality established that estimation errors dominate algorithm regrets, and thus, accurate estimators suffice to guarantee algorithms with low regrets. In this paper, we complete the reverse direction by establishing the necessity. In particular, we provide a generic transformation from algorithms for linear contextual bandits to estimators for linear models, and show that algorithm regrets dominate estimation errors of their induced estimators, i.e., low-regret algorithms must imply accurate estimators. Moreover, our analysis reduces the regret lower bound to an estimation error, bridging the lower bound analysis in linear contextual bandit problems and linear regression.
APA
He, J., Zhang, J. & Zhang, R. Q. (2022). A Reduction from Linear Contextual Bandits Lower Bounds to Estimations Lower Bounds. Proceedings of the 39th International Conference on Machine Learning, in Proceedings of Machine Learning Research 162:8660-8677. Available from https://proceedings.mlr.press/v162/he22e.html.

Related Material