Does Sparsity Help in Learning Misspecified Linear Bandits?

Jialin Dong; Lin Yang

Proceedings of the 40th International Conference on Machine Learning, PMLR 202:8317-8333, 2023.

Abstract

Recently, the study of linear misspecified bandits has generated intriguing implications for the hardness of learning in bandits and reinforcement learning (RL). In particular, Du et al. (2020) show that even if a learner is given linear features in $\mathbb{R}^d$ that approximate the rewards in a bandit or RL with a uniform error of $\varepsilon$, searching for an $O(\varepsilon)$-optimal action requires at least $\Omega(\exp(d))$ queries. Furthermore, Lattimore et al. (2020) show that a degraded $O(\varepsilon\sqrt{d})$-optimal solution can be learned within $\operatorname{poly}(d/\varepsilon)$ queries. Yet it is unknown whether a structural assumption on the ground-truth parameter, such as sparsity, could break the $\varepsilon\sqrt{d}$ barrier. In this paper, we address this question by showing that algorithms can obtain $O(\varepsilon)$-optimal actions by querying $\tilde{O}(\exp(m\varepsilon))$ actions, where $m$ is the sparsity parameter, removing the $\exp(d)$-dependence. We further show (with an information-theoretical lower bound) that this is the best possible if one demands an error of $m^{\delta}\varepsilon$ for $0<\delta<1$. We also show that $\operatorname{poly}(m/\varepsilon)$ bounds are possible when the linear features are “good”. These results provide a nearly complete picture of how sparsity can help in misspecified bandit learning and provide a deeper understanding of when linear features are “useful” for bandit and reinforcement learning with misspecification.
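
For quick comparison, the bounds quoted in the abstract can be placed side by side. The display below only restates what the abstract says; the row and column labels are chosen here for illustration, and entries the abstract does not specify are marked as such rather than filled in.

$$
\begin{array}{lll}
\text{Result} & \text{Suboptimality} & \text{Number of queries} \\
\hline
\text{Du et al. (2020)} & O(\varepsilon) & \Omega(\exp(d)) \ \text{(lower bound)} \\
\text{Lattimore et al. (2020)} & O(\varepsilon\sqrt{d}) & \operatorname{poly}(d/\varepsilon) \\
\text{This paper, sparsity } m & O(\varepsilon) & \tilde{O}(\exp(m\varepsilon)) \\
\text{This paper, ``good'' features} & \text{(not stated in the abstract)} & \operatorname{poly}(m/\varepsilon)
\end{array}
$$

According to the abstract, the third row is also the best possible when the demanded error is $m^{\delta}\varepsilon$ for $0<\delta<1$.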

Cite this Paper

BibTeX


@InProceedings{pmlr-v202-dong23g,
  title = 	 {Does Sparsity Help in Learning Misspecified Linear Bandits?},
  author =       {Dong, Jialin and Yang, Lin},
  booktitle = 	 {Proceedings of the 40th International Conference on Machine Learning},
  pages = 	 {8317--8333},
  year = 	 {2023},
  editor = 	 {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan},
  volume = 	 {202},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {23--29 Jul},
  publisher =    {PMLR},
  pdf = 	 {https://proceedings.mlr.press/v202/dong23g/dong23g.pdf},
  url = 	 {https://proceedings.mlr.press/v202/dong23g.html},
  abstract = 	 {Recently, the study of linear misspecified bandits has generated intriguing implications for the hardness of learning in bandits and reinforcement learning (RL). In particular, Du et al. (2020) show that even if a learner is given linear features in $\mathbb{R}^d$ that approximate the rewards in a bandit or RL with a uniform error of $\varepsilon$, searching for an $O(\varepsilon)$-optimal action requires at least $\Omega(\exp(d))$ queries. Furthermore, Lattimore et al. (2020) show that a degraded $O(\varepsilon\sqrt{d})$-optimal solution can be learned within $\operatorname{poly}(d/\varepsilon)$ queries. Yet it is unknown whether a structural assumption on the ground-truth parameter, such as sparsity, could break the $\varepsilon\sqrt{d}$ barrier. In this paper, we address this question by showing that algorithms can obtain $O(\varepsilon)$-optimal actions by querying $\tilde{O}(\exp(m\varepsilon))$ actions, where $m$ is the sparsity parameter, removing the $\exp(d)$-dependence. We further show (with an information-theoretical lower bound) that this is the best possible if one demands an error of $m^{\delta}\varepsilon$ for $0<\delta<1$. We also show that $\operatorname{poly}(m/\varepsilon)$ bounds are possible when the linear features are “good”. These results provide a nearly complete picture of how sparsity can help in misspecified bandit learning and provide a deeper understanding of when linear features are “useful” for bandit and reinforcement learning with misspecification.}
}

Endnote

%0 Conference Paper
%T Does Sparsity Help in Learning Misspecified Linear Bandits?
%A Jialin Dong
%A Lin Yang
%B Proceedings of the 40th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Andreas Krause
%E Emma Brunskill
%E Kyunghyun Cho
%E Barbara Engelhardt
%E Sivan Sabato
%E Jonathan Scarlett
%F pmlr-v202-dong23g
%I PMLR
%P 8317--8333
%U https://proceedings.mlr.press/v202/dong23g.html
%V 202
%X Recently, the study of linear misspecified bandits has generated intriguing implications for the hardness of learning in bandits and reinforcement learning (RL). In particular, Du et al. (2020) show that even if a learner is given linear features in $\mathbb{R}^d$ that approximate the rewards in a bandit or RL with a uniform error of $\varepsilon$, searching for an $O(\varepsilon)$-optimal action requires at least $\Omega(\exp(d))$ queries. Furthermore, Lattimore et al. (2020) show that a degraded $O(\varepsilon\sqrt{d})$-optimal solution can be learned within $\operatorname{poly}(d/\varepsilon)$ queries. Yet it is unknown whether a structural assumption on the ground-truth parameter, such as sparsity, could break the $\varepsilon\sqrt{d}$ barrier. In this paper, we address this question by showing that algorithms can obtain $O(\varepsilon)$-optimal actions by querying $\tilde{O}(\exp(m\varepsilon))$ actions, where $m$ is the sparsity parameter, removing the $\exp(d)$-dependence. We further show (with an information-theoretical lower bound) that this is the best possible if one demands an error of $m^{\delta}\varepsilon$ for $0<\delta<1$. We also show that $\operatorname{poly}(m/\varepsilon)$ bounds are possible when the linear features are “good”. These results provide a nearly complete picture of how sparsity can help in misspecified bandit learning and provide a deeper understanding of when linear features are “useful” for bandit and reinforcement learning with misspecification.

APA


Dong, J. & Yang, L. (2023). Does Sparsity Help in Learning Misspecified Linear Bandits? Proceedings of the 40th International Conference on Machine Learning, in Proceedings of Machine Learning Research 202:8317-8333. Available from https://proceedings.mlr.press/v202/dong23g.html.
