Sparse Nonparametric Contextual Bandits

Hamish Flynn, Julia Olkhovskaya, Paul Rognon-Vael
Proceedings of The 37th International Conference on Algorithmic Learning Theory, PMLR 313:1-44, 2026.

Abstract

We study the benefits of sparsity in nonparametric contextual bandit problems, in which the set of candidate features is countably or uncountably infinite. Our contribution is two-fold. First, using a novel reduction to sequences of multi-armed bandit problems, we provide lower bounds on the minimax regret, which show that polynomial dependence on the number of actions is generally unavoidable in this setting. Second, we show that a variant of the Feel-Good Thompson Sampling algorithm enjoys regret bounds that match our lower bounds up to logarithmic factors of the horizon, and have logarithmic dependence on the effective number of candidate features. When we apply our results to kernelised and neural contextual bandits, we find that sparsity enables better regret bounds whenever the horizon is large enough relative to the sparsity and the number of actions.

Cite this Paper

BibTeX
@InProceedings{pmlr-v313-flynn26a,
  title     = {Sparse Nonparametric Contextual Bandits},
  author    = {Flynn, Hamish and Olkhovskaya, Julia and Rognon-Vael, Paul},
  booktitle = {Proceedings of The 37th International Conference on Algorithmic Learning Theory},
  pages     = {1--44},
  year      = {2026},
  editor    = {Telgarsky, Matus and Ullman, Jonathan},
  volume    = {313},
  series    = {Proceedings of Machine Learning Research},
  month     = {23--26 Feb},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v313/main/assets/flynn26a/flynn26a.pdf},
  url       = {https://proceedings.mlr.press/v313/flynn26a.html},
  abstract  = {We study the benefits of sparsity in nonparametric contextual bandit problems, in which the set of candidate features is countably or uncountably infinite. Our contribution is two-fold. First, using a novel reduction to sequences of multi-armed bandit problems, we provide lower bounds on the minimax regret, which show that polynomial dependence on the number of actions is generally unavoidable in this setting. Second, we show that a variant of the Feel-Good Thompson Sampling algorithm enjoys regret bounds that match our lower bounds up to logarithmic factors of the horizon, and have logarithmic dependence on the effective number of candidate features. When we apply our results to kernelised and neural contextual bandits, we find that sparsity enables better regret bounds whenever the horizon is large enough relative to the sparsity and the number of actions.}
}
Endnote
%0 Conference Paper
%T Sparse Nonparametric Contextual Bandits
%A Hamish Flynn
%A Julia Olkhovskaya
%A Paul Rognon-Vael
%B Proceedings of The 37th International Conference on Algorithmic Learning Theory
%C Proceedings of Machine Learning Research
%D 2026
%E Matus Telgarsky
%E Jonathan Ullman
%F pmlr-v313-flynn26a
%I PMLR
%P 1--44
%U https://proceedings.mlr.press/v313/flynn26a.html
%V 313
%X We study the benefits of sparsity in nonparametric contextual bandit problems, in which the set of candidate features is countably or uncountably infinite. Our contribution is two-fold. First, using a novel reduction to sequences of multi-armed bandit problems, we provide lower bounds on the minimax regret, which show that polynomial dependence on the number of actions is generally unavoidable in this setting. Second, we show that a variant of the Feel-Good Thompson Sampling algorithm enjoys regret bounds that match our lower bounds up to logarithmic factors of the horizon, and have logarithmic dependence on the effective number of candidate features. When we apply our results to kernelised and neural contextual bandits, we find that sparsity enables better regret bounds whenever the horizon is large enough relative to the sparsity and the number of actions.
APA
Flynn, H., Olkhovskaya, J. & Rognon-Vael, P. (2026). Sparse Nonparametric Contextual Bandits. Proceedings of The 37th International Conference on Algorithmic Learning Theory, in Proceedings of Machine Learning Research 313:1-44. Available from https://proceedings.mlr.press/v313/flynn26a.html.