Near-Optimal Sample Complexity in Reward-Free Kernel-based Reinforcement Learning

Aya Kayal, Sattar Vakili, Laura Toni, Alberto Bernacchia
Proceedings of The 28th International Conference on Artificial Intelligence and Statistics, PMLR 258:874-882, 2025.

Abstract

Reinforcement Learning (RL) problems are being considered under increasingly complex structures. While tabular and linear models have been thoroughly explored, the analytical study of RL under non-linear function approximation, especially kernel-based models, has recently gained traction owing to their strong representational capacity and theoretical tractability. In this context, we examine the question of statistical efficiency in kernel-based RL within the reward-free RL framework, specifically asking: how many samples are required to design a near-optimal policy? Existing work addresses this question under restrictive assumptions about the class of kernel functions. We first explore this question assuming a generative model, then relax this assumption at the cost of increasing the sample complexity by a factor of $H$, the episode length. We tackle this fundamental problem using a broad class of kernels and a simpler algorithm compared to prior work. Our approach derives new confidence intervals for kernel ridge regression, specific to our RL setting, that may be of broader applicability. We further validate our theoretical findings through simulations.
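
To make the kernel ridge regression machinery concrete, the sketch below illustrates the standard mean-plus-width confidence interval $\mu(x) \pm \beta\,\sigma(x)$ that this line of work builds on. It is a minimal, illustrative implementation only, not the paper's specific construction: the RBF kernel, regularization parameter `lam`, and confidence scaling `beta` are arbitrary choices for the example.

```python
import numpy as np

def rbf_kernel(X1, X2, lengthscale=1.0):
    """Squared-exponential (RBF) kernel matrix between two sets of points."""
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale**2)

def krr_confidence_interval(X_train, y_train, X_test, lam=1.0, beta=2.0):
    """Kernel ridge regression prediction with a GP-style confidence width.

    Returns the KRR mean mu(x) = k_n(x)^T (K_n + lam I)^{-1} y and the width
    beta * sigma(x), where sigma^2(x) = k(x,x) - k_n(x)^T (K_n + lam I)^{-1} k_n(x).
    """
    K = rbf_kernel(X_train, X_train)                # n x n Gram matrix
    K_reg = K + lam * np.eye(len(X_train))          # regularized Gram matrix
    alpha = np.linalg.solve(K_reg, y_train)         # (K_n + lam I)^{-1} y
    k_star = rbf_kernel(X_test, X_train)            # m x n cross-covariances
    mu = k_star @ alpha                             # KRR / posterior mean
    v = np.linalg.solve(K_reg, k_star.T)            # (K_n + lam I)^{-1} k_n(x)
    sigma2 = 1.0 - np.sum(k_star * v.T, axis=1)     # k(x,x) = 1 for the RBF kernel
    sigma = np.sqrt(np.maximum(sigma2, 0.0))
    return mu, beta * sigma                         # mean and confidence width

# Toy usage: fit on a few noisy samples of a smooth target function.
rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(20, 1))
y = np.sin(2 * X[:, 0]) + 0.1 * rng.standard_normal(20)
Xq = np.linspace(-2, 2, 5).reshape(-1, 1)
mu, width = krr_confidence_interval(X, y, Xq)
print(np.c_[mu, width])
```

The paper's contribution concerns how tightly the width term can be bounded in the RL setting for a broad class of kernels; the code above only shows the generic estimator that such confidence intervals are built around.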

Cite this Paper


BibTeX
@InProceedings{pmlr-v258-kayal25a,
  title     = {Near-Optimal Sample Complexity in Reward-Free Kernel-based Reinforcement Learning},
  author    = {Kayal, Aya and Vakili, Sattar and Toni, Laura and Bernacchia, Alberto},
  booktitle = {Proceedings of The 28th International Conference on Artificial Intelligence and Statistics},
  pages     = {874--882},
  year      = {2025},
  editor    = {Li, Yingzhen and Mandt, Stephan and Agrawal, Shipra and Khan, Emtiyaz},
  volume    = {258},
  series    = {Proceedings of Machine Learning Research},
  month     = {03--05 May},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v258/main/assets/kayal25a/kayal25a.pdf},
  url       = {https://proceedings.mlr.press/v258/kayal25a.html},
  abstract  = {Reinforcement Learning (RL) problems are being considered under increasingly complex structures. While tabular and linear models have been thoroughly explored, the analytical study of RL under non-linear function approximation, especially kernel-based models, has recently gained traction owing to their strong representational capacity and theoretical tractability. In this context, we examine the question of statistical efficiency in kernel-based RL within the reward-free RL framework, specifically asking: how many samples are required to design a near-optimal policy? Existing work addresses this question under restrictive assumptions about the class of kernel functions. We first explore this question assuming a generative model, then relax this assumption at the cost of increasing the sample complexity by a factor of $H$, the episode length. We tackle this fundamental problem using a broad class of kernels and a simpler algorithm compared to prior work. Our approach derives new confidence intervals for kernel ridge regression, specific to our RL setting, that may be of broader applicability. We further validate our theoretical findings through simulations.}
}
Endnote
%0 Conference Paper
%T Near-Optimal Sample Complexity in Reward-Free Kernel-based Reinforcement Learning
%A Aya Kayal
%A Sattar Vakili
%A Laura Toni
%A Alberto Bernacchia
%B Proceedings of The 28th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2025
%E Yingzhen Li
%E Stephan Mandt
%E Shipra Agrawal
%E Emtiyaz Khan
%F pmlr-v258-kayal25a
%I PMLR
%P 874--882
%U https://proceedings.mlr.press/v258/kayal25a.html
%V 258
%X Reinforcement Learning (RL) problems are being considered under increasingly complex structures. While tabular and linear models have been thoroughly explored, the analytical study of RL under non-linear function approximation, especially kernel-based models, has recently gained traction owing to their strong representational capacity and theoretical tractability. In this context, we examine the question of statistical efficiency in kernel-based RL within the reward-free RL framework, specifically asking: how many samples are required to design a near-optimal policy? Existing work addresses this question under restrictive assumptions about the class of kernel functions. We first explore this question assuming a generative model, then relax this assumption at the cost of increasing the sample complexity by a factor of $H$, the episode length. We tackle this fundamental problem using a broad class of kernels and a simpler algorithm compared to prior work. Our approach derives new confidence intervals for kernel ridge regression, specific to our RL setting, that may be of broader applicability. We further validate our theoretical findings through simulations.
APA
Kayal, A., Vakili, S., Toni, L. & Bernacchia, A. (2025). Near-Optimal Sample Complexity in Reward-Free Kernel-based Reinforcement Learning. Proceedings of The 28th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 258:874-882. Available from https://proceedings.mlr.press/v258/kayal25a.html.
