First-Order Methods for Wasserstein Distributionally Robust MDP

Julien Grand Clement; Christian Kroer

Proceedings of the 38th International Conference on Machine Learning, PMLR 139:2010-2019, 2021.

Abstract

Markov decision processes (MDPs) are known to be sensitive to parameter specification. Distributionally robust MDPs alleviate this issue by allowing for \textit{ambiguity sets} which give a set of possible distributions over parameter sets. The goal is to find an optimal policy with respect to the worst-case parameter distribution. We propose a framework for solving distributionally robust MDPs via first-order methods, and instantiate it for several types of Wasserstein ambiguity sets. By developing efficient proximal updates, our algorithms achieve a convergence rate of $O\left(NA^{2.5}S^{3.5}\log(S)\log(\epsilon^{-1})\epsilon^{-1.5} \right)$ for the number of kernels $N$ in the support of the nominal distribution, states $S$, and actions $A$; this rate varies slightly based on the Wasserstein setup. Our dependence on $N$, $A$, and $S$ is significantly better than existing methods, which have a complexity of $O\left(N^{3.5}A^{3.5}S^{4.5}\log^{2}(\epsilon^{-1}) \right)$. Numerical experiments show that our algorithm is significantly more scalable than state-of-the-art approaches across several domains.
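To get a feel for how the two rates quoted in the abstract scale against each other, the sketch below evaluates their leading terms and takes the ratio. It is only an illustration: the function names are hypothetical, the constants hidden by the $O(\cdot)$ notation are dropped (assumed equal to 1), and natural logarithms are assumed, so the numbers indicate scaling behaviour rather than actual running times.

```python
import math


def first_order_bound(N, A, S, eps):
    """Leading term N A^2.5 S^3.5 log(S) log(1/eps) eps^-1.5 (constants dropped)."""
    return N * A**2.5 * S**3.5 * math.log(S) * math.log(1 / eps) * eps**-1.5


def existing_bound(N, A, S, eps):
    """Leading term N^3.5 A^3.5 S^4.5 log^2(1/eps) (constants dropped)."""
    return N**3.5 * A**3.5 * S**4.5 * math.log(1 / eps) ** 2


def bound_ratio(N, A, S, eps):
    """Existing bound divided by the first-order bound.

    Algebraically this simplifies to
    N^2.5 * A * S * log(1/eps) * eps^1.5 / log(S).
    """
    return existing_bound(N, A, S, eps) / first_order_bound(N, A, S, eps)


if __name__ == "__main__":
    # Hypothetical problem sizes, chosen only for illustration.
    for N, A, S, eps in [(10, 10, 100, 0.1), (100, 10, 100, 0.1), (10, 10, 100, 0.001)]:
        print(N, A, S, eps, bound_ratio(N, A, S, eps))
```

The simplified ratio makes the trade-off visible: the first-order rate gains a factor of roughly $N^{2.5}AS$ in the problem dimensions but pays $\epsilon^{-1.5}$ in accuracy, so under these assumptions its advantage grows with problem size and shrinks as the target accuracy $\epsilon$ gets smaller.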

Cite this Paper

BibTeX


@InProceedings{pmlr-v139-clement21a,
  title = 	 {First-Order Methods for Wasserstein Distributionally Robust MDP},
  author =       {Clement, Julien Grand and Kroer, Christian},
  booktitle = 	 {Proceedings of the 38th International Conference on Machine Learning},
  pages = 	 {2010--2019},
  year = 	 {2021},
  editor = 	 {Meila, Marina and Zhang, Tong},
  volume = 	 {139},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {18--24 Jul},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v139/clement21a/clement21a.pdf},
  url = 	 {https://proceedings.mlr.press/v139/clement21a.html},
  abstract = 	 {Markov decision processes (MDPs) are known to be sensitive to parameter specification. Distributionally robust MDPs alleviate this issue by allowing for \textit{ambiguity sets} which give a set of possible distributions over parameter sets. The goal is to find an optimal policy with respect to the worst-case parameter distribution. We propose a framework for solving Distributionally robust MDPs via first-order methods, and instantiate it for several types of Wasserstein ambiguity sets. By developing efficient proximal updates, our algorithms achieve a convergence rate of $O\left(NA^{2.5}S^{3.5}\log(S)\log(\epsilon^{-1})\epsilon^{-1.5} \right)$ for the number of kernels $N$ in the support of the nominal distribution, states $S$, and actions $A$; this rate varies slightly based on the Wasserstein setup. Our dependence on $N,A$ and $S$ is significantly better than existing methods, which have a complexity of $O\left(N^{3.5}A^{3.5}S^{4.5}\log^{2}(\epsilon^{-1}) \right)$. Numerical experiments show that our algorithm is significantly more scalable than state-of-the-art approaches across several domains.}
}

Endnote

%0 Conference Paper
%T First-Order Methods for Wasserstein Distributionally Robust MDP
%A Julien Grand Clement
%A Christian Kroer
%B Proceedings of the 38th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2021
%E Marina Meila
%E Tong Zhang
%F pmlr-v139-clement21a
%I PMLR
%P 2010--2019
%U https://proceedings.mlr.press/v139/clement21a.html
%V 139
%X Markov decision processes (MDPs) are known to be sensitive to parameter specification. Distributionally robust MDPs alleviate this issue by allowing for \textit{ambiguity sets} which give a set of possible distributions over parameter sets. The goal is to find an optimal policy with respect to the worst-case parameter distribution. We propose a framework for solving Distributionally robust MDPs via first-order methods, and instantiate it for several types of Wasserstein ambiguity sets. By developing efficient proximal updates, our algorithms achieve a convergence rate of $O\left(NA^{2.5}S^{3.5}\log(S)\log(\epsilon^{-1})\epsilon^{-1.5} \right)$ for the number of kernels $N$ in the support of the nominal distribution, states $S$, and actions $A$; this rate varies slightly based on the Wasserstein setup. Our dependence on $N,A$ and $S$ is significantly better than existing methods, which have a complexity of $O\left(N^{3.5}A^{3.5}S^{4.5}\log^{2}(\epsilon^{-1}) \right)$. Numerical experiments show that our algorithm is significantly more scalable than state-of-the-art approaches across several domains.

APA


Clement, J.G. & Kroer, C. (2021). First-Order Methods for Wasserstein Distributionally Robust MDP. Proceedings of the 38th International Conference on Machine Learning, in Proceedings of Machine Learning Research 139:2010-2019. Available from https://proceedings.mlr.press/v139/clement21a.html.
