Interpretable Function Approximation with Gaussian Processes in Value-Based Model-Free Reinforcement Learning

Matthijs van der Lende, Matthia Sabatelli, Juan Cardenas-Cartagena
Proceedings of the 6th Northern Lights Deep Learning Conference (NLDL), PMLR 265:141-154, 2025.

Abstract

Estimating value functions in Reinforcement Learning (RL) for continuous spaces is challenging. While traditional function approximators, such as linear models, offer interpretability, they are limited in their complexity. In contrast, deep neural networks can model more complex functions but are less interpretable. Gaussian Process (GP) models bridge this gap by offering interpretable uncertainty estimates while modeling complex nonlinear functions. This work introduces a Bayesian nonparametric framework using GPs, including Sparse Variational (SVGP) and Deep GPs (DGP), for off-policy and on-policy learning. Results on popular classic control environments show that SVGPs/DGPs outperform linear models but converge slower than their neural network counterparts. Nevertheless, they do provide valuable insights when it comes to uncertainty estimation and interpretability for RL.
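To make the core idea concrete, here is a minimal sketch of Gaussian Process regression with an RBF kernel used to approximate a toy 1-D value function, illustrating the interpretable uncertainty estimates the abstract refers to. This is generic GP machinery built from scratch in NumPy, not the paper's SVGP/DGP implementation; the kernel hyperparameters, noise level, and toy target function are illustrative assumptions.

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=0.5, variance=1.0):
    """Squared-exponential kernel k(a, b) = variance * exp(-(a-b)^2 / (2 l^2))."""
    sq = (A[:, None] - B[None, :]) ** 2
    return variance * np.exp(-0.5 * sq / lengthscale**2)

def gp_posterior(X_train, y_train, X_test, noise=1e-2):
    """Exact GP posterior mean and (latent) variance at X_test, via Cholesky."""
    K = rbf_kernel(X_train, X_train) + noise * np.eye(len(X_train))
    K_s = rbf_kernel(X_train, X_test)
    K_ss = rbf_kernel(X_test, X_test)
    L = np.linalg.cholesky(K)
    # alpha = K^{-1} y, computed stably with two triangular solves
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = np.diag(K_ss) - np.sum(v**2, axis=0)
    return mean, var

# Toy "value function" over a 1-D state space: V(s) = sin(3s).
states = np.linspace(-1.0, 1.0, 15)
values = np.sin(3.0 * states)

# Query one in-distribution state, one interpolated state, and one
# state far outside the training data (s = 3.0).
test_states = np.array([0.0, 0.5, 3.0])
mean, var = gp_posterior(states, values, test_states)
```

The predictive variance grows sharply at the out-of-distribution query (`s = 3.0`), while staying small near the training states; this is the kind of calibrated uncertainty signal that a linear model or a standard neural network value function does not provide out of the box. Sparse variational GPs replace the exact posterior above with an approximation over a small set of inducing points to scale beyond a few thousand transitions.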

Cite this Paper


BibTeX
@InProceedings{pmlr-v265-lende25a,
  title     = {Interpretable Function Approximation with Gaussian Processes in Value-Based Model-Free Reinforcement Learning},
  author    = {Lende, Matthijs van der and Sabatelli, Matthia and Cardenas-Cartagena, Juan},
  booktitle = {Proceedings of the 6th Northern Lights Deep Learning Conference (NLDL)},
  pages     = {141--154},
  year      = {2025},
  editor    = {Lutchyn, Tetiana and Ram\'{i}rez Rivera, Ad\'{i}n and Ricaud, Benjamin},
  volume    = {265},
  series    = {Proceedings of Machine Learning Research},
  month     = {07--09 Jan},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v265/main/assets/lende25a/lende25a.pdf},
  url       = {https://proceedings.mlr.press/v265/lende25a.html},
  abstract  = {Estimating value functions in Reinforcement Learning (RL) for continuous spaces is challenging. While traditional function approximators, such as linear models, offer interpretability, they are limited in their complexity. In contrast, deep neural networks can model more complex functions but are less interpretable. Gaussian Process (GP) models bridge this gap by offering interpretable uncertainty estimates while modeling complex nonlinear functions. This work introduces a Bayesian nonparametric framework using GPs, including Sparse Variational (SVGP) and Deep GPs (DGP), for off-policy and on-policy learning. Results on popular classic control environments show that SVGPs/DGPs outperform linear models but converge slower than their neural network counterparts. Nevertheless, they do provide valuable insights when it comes to uncertainty estimation and interpretability for RL.}
}
Endnote
%0 Conference Paper
%T Interpretable Function Approximation with Gaussian Processes in Value-Based Model-Free Reinforcement Learning
%A Matthijs van der Lende
%A Matthia Sabatelli
%A Juan Cardenas-Cartagena
%B Proceedings of the 6th Northern Lights Deep Learning Conference (NLDL)
%C Proceedings of Machine Learning Research
%D 2025
%E Tetiana Lutchyn
%E Adín Ramírez Rivera
%E Benjamin Ricaud
%F pmlr-v265-lende25a
%I PMLR
%P 141--154
%U https://proceedings.mlr.press/v265/lende25a.html
%V 265
%X Estimating value functions in Reinforcement Learning (RL) for continuous spaces is challenging. While traditional function approximators, such as linear models, offer interpretability, they are limited in their complexity. In contrast, deep neural networks can model more complex functions but are less interpretable. Gaussian Process (GP) models bridge this gap by offering interpretable uncertainty estimates while modeling complex nonlinear functions. This work introduces a Bayesian nonparametric framework using GPs, including Sparse Variational (SVGP) and Deep GPs (DGP), for off-policy and on-policy learning. Results on popular classic control environments show that SVGPs/DGPs outperform linear models but converge slower than their neural network counterparts. Nevertheless, they do provide valuable insights when it comes to uncertainty estimation and interpretability for RL.
APA
Lende, M.v.d., Sabatelli, M. & Cardenas-Cartagena, J. (2025). Interpretable Function Approximation with Gaussian Processes in Value-Based Model-Free Reinforcement Learning. Proceedings of the 6th Northern Lights Deep Learning Conference (NLDL), in Proceedings of Machine Learning Research 265:141-154. Available from https://proceedings.mlr.press/v265/lende25a.html.