Gradient descent algorithms for Bures-Wasserstein barycenters

Sinho Chewi, Tyler Maunu, Philippe Rigollet, Austin J. Stromme
Proceedings of Thirty Third Conference on Learning Theory, PMLR 125:1276-1304, 2020.

Abstract

We study first order methods to compute the barycenter of a probability distribution $P$ over the space of probability measures with finite second moment. We develop a framework to derive global rates of convergence for both gradient descent and stochastic gradient descent despite the fact that the barycenter functional is not geodesically convex. Our analysis overcomes this technical hurdle by employing a Polyak-Łojasiewicz (PL) inequality and relies on tools from optimal transport and metric geometry. In turn, we establish a PL inequality when $P$ is supported on the Bures-Wasserstein manifold of Gaussian probability measures. It leads to the first global rates of convergence for first order methods in this context.
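To make the setting concrete (this illustration is not part of the original abstract page): the barycenter of $P$ minimizes the functional $F(b) = \frac{1}{2} \int W_2^2(b, q) \, dP(q)$, and on the Bures-Wasserstein manifold of centered Gaussians each measure is identified with its covariance matrix, so a gradient step reduces to matrix operations. The sketch below shows gradient descent with unit step size for the barycenter of a finite family of centered Gaussians; with this step size the update coincides with the classical fixed-point iteration for Gaussian barycenters. The function name, initialization, and stopping rule are our own illustrative choices, not taken from the paper.

    import numpy as np
    from scipy.linalg import sqrtm

    def bw_barycenter_gd(covs, n_iter=100, tol=1e-10):
        """Gradient descent (unit step) for the Bures-Wasserstein barycenter
        of centered Gaussians N(0, covs[i]).

        Illustrative sketch only: uniform weights, arbitrary SPD initialization.
        """
        sigma = covs[0].copy()  # any positive-definite initialization works
        for _ in range(n_iter):
            root = np.real(sqrtm(sigma))        # Sigma^{1/2} (real part guards numerical noise)
            root_inv = np.linalg.inv(root)      # Sigma^{-1/2}
            # Optimal transport map from N(0, Sigma) to N(0, S):
            #   T_S = Sigma^{-1/2} (Sigma^{1/2} S Sigma^{1/2})^{1/2} Sigma^{-1/2}
            maps = [root_inv @ np.real(sqrtm(root @ S @ root)) @ root_inv for S in covs]
            T = sum(maps) / len(covs)           # averaged transport map (negative gradient direction)
            new_sigma = T @ sigma @ T           # unit-step update: Sigma <- T Sigma T
            if np.linalg.norm(new_sigma - sigma, 'fro') < tol:
                return new_sigma
            sigma = new_sigma
        return sigma

    # Example: barycenter of two 2x2 covariance matrices
    covs = [np.array([[2.0, 0.0], [0.0, 1.0]]),
            np.array([[1.0, 0.5], [0.5, 1.0]])]
    print(bw_barycenter_gd(covs))

A stochastic variant in the spirit of the abstract would, at each step, draw a single covariance from $P$ and apply the same update with its transport map in place of the average.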

Cite this Paper


BibTeX
@InProceedings{pmlr-v125-chewi20a,
  title     = {Gradient descent algorithms for {B}ures-{W}asserstein barycenters},
  author    = {Chewi, Sinho and Maunu, Tyler and Rigollet, Philippe and Stromme, {Austin J.}},
  booktitle = {Proceedings of Thirty Third Conference on Learning Theory},
  pages     = {1276--1304},
  year      = {2020},
  editor    = {Abernethy, Jacob and Agarwal, Shivani},
  volume    = {125},
  series    = {Proceedings of Machine Learning Research},
  month     = {09--12 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v125/chewi20a/chewi20a.pdf},
  url       = {https://proceedings.mlr.press/v125/chewi20a.html},
  abstract  = {We study first order methods to compute the barycenter of a probability distribution $P$ over the space of probability measures with finite second moment. We develop a framework to derive global rates of convergence for both gradient descent and stochastic gradient descent despite the fact that the barycenter functional is not geodesically convex. Our analysis overcomes this technical hurdle by employing a Polyak-{\L}ojasiewicz (PL) inequality and relies on tools from optimal transport and metric geometry. In turn, we establish a PL inequality when $P$ is supported on the Bures-Wasserstein manifold of Gaussian probability measures. It leads to the first global rates of convergence for first order methods in this context.}
}
Endnote
%0 Conference Paper
%T Gradient descent algorithms for Bures-Wasserstein barycenters
%A Sinho Chewi
%A Tyler Maunu
%A Philippe Rigollet
%A Austin J. Stromme
%B Proceedings of Thirty Third Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2020
%E Jacob Abernethy
%E Shivani Agarwal
%F pmlr-v125-chewi20a
%I PMLR
%P 1276--1304
%U https://proceedings.mlr.press/v125/chewi20a.html
%V 125
%X We study first order methods to compute the barycenter of a probability distribution $P$ over the space of probability measures with finite second moment. We develop a framework to derive global rates of convergence for both gradient descent and stochastic gradient descent despite the fact that the barycenter functional is not geodesically convex. Our analysis overcomes this technical hurdle by employing a Polyak-Łojasiewicz (PL) inequality and relies on tools from optimal transport and metric geometry. In turn, we establish a PL inequality when $P$ is supported on the Bures-Wasserstein manifold of Gaussian probability measures. It leads to the first global rates of convergence for first order methods in this context.
APA
Chewi, S., Maunu, T., Rigollet, P. & Stromme, A.J. (2020). Gradient descent algorithms for Bures-Wasserstein barycenters. Proceedings of Thirty Third Conference on Learning Theory, in Proceedings of Machine Learning Research 125:1276-1304. Available from https://proceedings.mlr.press/v125/chewi20a.html.