Locally Differentially Private Reinforcement Learning for Linear Mixture Markov Decision Processes
Proceedings of The 14th Asian Conference on Machine Learning, PMLR 189:627-642, 2023.
Abstract
Reinforcement learning (RL) algorithms can be used to provide personalized services, which rely on users' private and sensitive data. To protect the users' privacy, privacy-preserving RL algorithms are in demand. In this paper, we study RL with linear function approximation and local differential privacy (LDP) guarantees. We propose a novel $(\varepsilon, \delta)$-LDP algorithm for learning a class of Markov decision processes (MDPs) dubbed linear mixture MDPs, which obtains an $\widetilde{\mathcal{O}}(d^{5/4} H^{7/4} T^{3/4} (\log(1/\delta))^{1/4} \sqrt{1/\varepsilon})$ regret, where $d$ is the dimension of the feature mapping, $H$ is the length of the planning horizon, and $T$ is the number of interactions with the environment. We also prove a lower bound $\Omega(d H \sqrt{T} / (e^{\varepsilon}(e^{\varepsilon} - 1)))$ for learning linear mixture MDPs under the $\varepsilon$-LDP constraint. Experiments on synthetic datasets verify the effectiveness of our algorithm. To the best of our knowledge, this is the first provable privacy-preserving RL algorithm with linear function approximation.
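To illustrate the kind of user-side privatization such LDP algorithms rely on, the sketch below applies the standard $(\varepsilon, \delta)$ Gaussian mechanism to the local regression statistics ($\phi\phi^\top$ and $\phi \cdot y$) before they leave the user. This is a minimal sketch, not the paper's algorithm: the function names, the sensitivity bounds, and the assumption that features and targets are bounded are all illustrative.

```python
import numpy as np

def gaussian_sigma(sensitivity, eps, delta):
    """Noise scale of the classic (eps, delta) Gaussian mechanism."""
    return sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / eps

def privatize_statistics(phi, y, eps, delta, phi_bound=1.0, y_bound=1.0, rng=None):
    """User-side LDP release: add Gaussian noise to the local regression
    statistics (phi phi^T, phi * y) so raw trajectory data never leaves
    the user; the server only ever sees the noisy statistics."""
    rng = np.random.default_rng() if rng is None else rng
    d = phi.shape[0]
    # L2 sensitivities of one user's contribution (illustrative bounds,
    # assuming ||phi||_2 <= phi_bound and |y| <= y_bound).
    sigma_gram = gaussian_sigma(2.0 * phi_bound ** 2, eps, delta)
    sigma_vec = gaussian_sigma(2.0 * phi_bound * y_bound, eps, delta)
    noise = rng.normal(0.0, sigma_gram, size=(d, d))
    # Symmetrize the noise so the released Gram matrix stays symmetric.
    noisy_gram = np.outer(phi, phi) + (noise + noise.T) / 2.0
    noisy_vec = phi * y + rng.normal(0.0, sigma_vec, size=d)
    return noisy_gram, noisy_vec
```

In this style of scheme, the server sums the noisy statistics across users and runs ridge regression on the aggregate; because each user adds independent noise, the noise in the aggregate grows only with the square root of the number of releases, which is what drives the $T^{3/4}$-type regret under LDP.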