Locally Differentially Private Reinforcement Learning for Linear Mixture Markov Decision Processes
Proceedings of The 14th Asian Conference on Machine Learning, PMLR 189:627-642, 2023.
Abstract
Reinforcement learning (RL) algorithms can be used to provide personalized services, which rely on users' private and sensitive data. To protect the users' privacy, privacy-preserving RL algorithms are in demand. In this paper, we study RL with linear function approximation and local differential privacy (LDP) guarantees. We propose a novel $(\varepsilon, \delta)$-LDP algorithm for learning a class of Markov decision processes (MDPs) dubbed linear mixture MDPs, which obtains an $\widetilde{\mathcal{O}}(d^{5/4} H^{7/4} T^{3/4} (\log(1/\delta))^{1/4} \sqrt{1/\varepsilon})$ regret, where $d$ is the dimension of the feature mapping, $H$ is the length of the planning horizon, and $T$ is the number of interactions with the environment. We also prove a lower bound $\Omega(d H \sqrt{T} / (e^{\varepsilon}(e^{\varepsilon} - 1)))$ for learning linear mixture MDPs under the $\varepsilon$-LDP constraint. Experiments on synthetic datasets verify the effectiveness of our algorithm. To the best of our knowledge, this is the first provable privacy-preserving RL algorithm with linear function approximation.
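To illustrate the kind of user-side privatization such LDP algorithms rely on, the sketch below applies the standard $(\varepsilon, \delta)$ Gaussian mechanism to the local regression statistics ($\phi\phi^\top$ and $\phi \cdot y$) before they leave the user. This is a minimal sketch, not the paper's algorithm: the function names, the sensitivity bounds, and the assumption that features and targets are bounded are all illustrative.

```python
import numpy as np

def gaussian_sigma(sensitivity, eps, delta):
    """Noise scale of the classic (eps, delta) Gaussian mechanism."""
    return sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / eps

def privatize_statistics(phi, y, eps, delta, phi_bound=1.0, y_bound=1.0, rng=None):
    """User-side LDP release: add Gaussian noise to the local regression
    statistics (phi phi^T, phi * y) so raw trajectory data never leaves
    the user; the server only ever sees the noisy statistics."""
    rng = np.random.default_rng() if rng is None else rng
    d = phi.shape[0]
    # L2 sensitivities of one user's contribution (illustrative bounds,
    # assuming ||phi||_2 <= phi_bound and |y| <= y_bound).
    sigma_gram = gaussian_sigma(2.0 * phi_bound ** 2, eps, delta)
    sigma_vec = gaussian_sigma(2.0 * phi_bound * y_bound, eps, delta)
    noise = rng.normal(0.0, sigma_gram, size=(d, d))
    # Symmetrize the noise so the released Gram matrix stays symmetric.
    noisy_gram = np.outer(phi, phi) + (noise + noise.T) / 2.0
    noisy_vec = phi * y + rng.normal(0.0, sigma_vec, size=d)
    return noisy_gram, noisy_vec
```

In this style of scheme, the server sums the noisy statistics across users and runs ridge regression on the aggregate; because each user adds independent noise, the noise in the aggregate grows only with the square root of the number of releases, which is what drives the $T^{3/4}$-type regret under LDP.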