Federated Reinforcement Learning: Linear Speedup Under Markovian Sampling

Sajad Khodadadian; Pranay Sharma; Gauri Joshi; Siva Theja Maguluri

Proceedings of the 39th International Conference on Machine Learning, PMLR 162:10997-11057, 2022.

Abstract

Since reinforcement learning algorithms are notoriously data-intensive, the task of sampling observations from the environment is usually split across multiple agents. However, transferring these observations from the agents to a central location can be prohibitively expensive in terms of the communication cost, and it can also compromise the privacy of each agent’s local behavior policy. In this paper, we consider a federated reinforcement learning framework where multiple agents collaboratively learn a global model, without sharing their individual data and policies. Each agent maintains a local copy of the model and updates it using locally sampled data. Although having N agents enables the sampling of N times more data, it is not clear if it leads to proportional convergence speedup. We propose federated versions of on-policy TD, off-policy TD and Q-learning, and analyze their convergence. For all these algorithms, to the best of our knowledge, we are the first to consider Markovian noise and multiple local updates, and prove a linear convergence speedup with respect to the number of agents. To obtain these results, we show that federated TD and Q-learning are special cases of a general framework for federated stochastic approximation with Markovian noise, and we leverage this framework to provide a unified convergence analysis that applies to all the algorithms.
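The setup described in the abstract (each agent keeps a local copy of the model, takes several local updates on its own Markovian trajectory, and the copies are then combined into a global model) can be illustrated with a minimal sketch. The code below is not the paper's algorithm. It assumes a tabular value function, on-policy TD(0), a constant step size, and plain averaging of the local copies after every `local_steps` updates. The function name `federated_td0` and all of its parameters are hypothetical and chosen for illustration only.

```python
import random


def federated_td0(P, r, gamma, num_agents, rounds, local_steps, alpha, seed=0):
    """Tabular federated TD(0) sketch (illustrative assumptions, not the paper's method).

    P: row-stochastic transition matrix of the Markov chain induced by the policy.
    r: reward received in each state.
    Each agent follows its own trajectory of the chain (Markovian sampling),
    runs `local_steps` TD(0) updates on a local copy of the value table, and
    the copies are averaged at the end of every round (assumed aggregation rule).
    """
    rng = random.Random(seed)
    n = len(P)
    states = list(range(n))
    global_v = [0.0] * n
    # Each agent carries its chain state across rounds, so samples stay correlated.
    agent_state = [rng.randrange(n) for _ in range(num_agents)]

    for _ in range(rounds):
        local_tables = []
        for i in range(num_agents):
            v = list(global_v)  # start the round from the shared global model
            s = agent_state[i]
            for _ in range(local_steps):
                s_next = rng.choices(states, weights=P[s])[0]
                td_error = r[s] + gamma * v[s_next] - v[s]
                v[s] += alpha * td_error
                s = s_next
            agent_state[i] = s
            local_tables.append(v)
        # Only value tables are combined; raw trajectories are never shared.
        global_v = [sum(t[j] for t in local_tables) / num_agents for j in range(n)]
    return global_v


if __name__ == "__main__":
    P = [[0.9, 0.1], [0.1, 0.9]]
    r = [1.0, 0.0]
    print(federated_td0(P, r, gamma=0.5, num_agents=4,
                        rounds=200, local_steps=10, alpha=0.05))
```

For this two-state chain the exact values solve V = r + γPV, which gives V ≈ (1.83, 0.17), so the printed estimate can be compared against that fixed point. Increasing `num_agents` in this sketch averages more independent trajectories per round, which is the intuition behind the speedup question the abstract raises. The sketch itself proves nothing about convergence rates.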

Cite this Paper

BibTeX


@InProceedings{pmlr-v162-khodadadian22a,
  title = 	 {Federated Reinforcement Learning: Linear Speedup Under {M}arkovian Sampling},
  author =       {Khodadadian, Sajad and Sharma, Pranay and Joshi, Gauri and Maguluri, Siva Theja},
  booktitle = 	 {Proceedings of the 39th International Conference on Machine Learning},
  pages = 	 {10997--11057},
  year = 	 {2022},
  editor = 	 {Chaudhuri, Kamalika and Jegelka, Stefanie and Song, Le and Szepesvari, Csaba and Niu, Gang and Sabato, Sivan},
  volume = 	 {162},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {17--23 Jul},
  publisher =    {PMLR},
  pdf = 	 {https://proceedings.mlr.press/v162/khodadadian22a/khodadadian22a.pdf},
  url = 	 {https://proceedings.mlr.press/v162/khodadadian22a.html},
  abstract = 	 {Since reinforcement learning algorithms are notoriously data-intensive, the task of sampling observations from the environment is usually split across multiple agents. However, transferring these observations from the agents to a central location can be prohibitively expensive in terms of the communication cost, and it can also compromise the privacy of each agent’s local behavior policy. In this paper, we consider a federated reinforcement learning framework where multiple agents collaboratively learn a global model, without sharing their individual data and policies. Each agent maintains a local copy of the model and updates it using locally sampled data. Although having N agents enables the sampling of N times more data, it is not clear if it leads to proportional convergence speedup. We propose federated versions of on-policy TD, off-policy TD and Q-learning, and analyze their convergence. For all these algorithms, to the best of our knowledge, we are the first to consider Markovian noise and multiple local updates, and prove a linear convergence speedup with respect to the number of agents. To obtain these results, we show that federated TD and Q-learning are special cases of a general framework for federated stochastic approximation with Markovian noise, and we leverage this framework to provide a unified convergence analysis that applies to all the algorithms.}
}

Endnote

%0 Conference Paper
%T Federated Reinforcement Learning: Linear Speedup Under Markovian Sampling
%A Sajad Khodadadian
%A Pranay Sharma
%A Gauri Joshi
%A Siva Theja Maguluri
%B Proceedings of the 39th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2022
%E Kamalika Chaudhuri
%E Stefanie Jegelka
%E Le Song
%E Csaba Szepesvari
%E Gang Niu
%E Sivan Sabato
%F pmlr-v162-khodadadian22a
%I PMLR
%P 10997--11057
%U https://proceedings.mlr.press/v162/khodadadian22a.html
%V 162
%X Since reinforcement learning algorithms are notoriously data-intensive, the task of sampling observations from the environment is usually split across multiple agents. However, transferring these observations from the agents to a central location can be prohibitively expensive in terms of the communication cost, and it can also compromise the privacy of each agent’s local behavior policy. In this paper, we consider a federated reinforcement learning framework where multiple agents collaboratively learn a global model, without sharing their individual data and policies. Each agent maintains a local copy of the model and updates it using locally sampled data. Although having N agents enables the sampling of N times more data, it is not clear if it leads to proportional convergence speedup. We propose federated versions of on-policy TD, off-policy TD and Q-learning, and analyze their convergence. For all these algorithms, to the best of our knowledge, we are the first to consider Markovian noise and multiple local updates, and prove a linear convergence speedup with respect to the number of agents. To obtain these results, we show that federated TD and Q-learning are special cases of a general framework for federated stochastic approximation with Markovian noise, and we leverage this framework to provide a unified convergence analysis that applies to all the algorithms.

APA


Khodadadian, S., Sharma, P., Joshi, G. & Maguluri, S.T. (2022). Federated Reinforcement Learning: Linear Speedup Under Markovian Sampling. Proceedings of the 39th International Conference on Machine Learning, in Proceedings of Machine Learning Research 162:10997-11057. Available from https://proceedings.mlr.press/v162/khodadadian22a.html.
