Decentralized Riemannian Gradient Descent on the Stiefel Manifold

Shixiang Chen, Alfredo Garcia, Mingyi Hong, Shahin Shahrampour
Proceedings of the 38th International Conference on Machine Learning, PMLR 139:1594-1605, 2021.

Abstract

We consider a distributed non-convex optimization problem in which a network of agents aims to minimize a global function over the Stiefel manifold. The global function is a finite sum of smooth local functions, where each local function is associated with one agent and agents communicate with each other over an undirected connected graph. The problem is non-convex because the local functions are possibly non-convex (but smooth) and the Stiefel manifold is a non-convex set. We present a decentralized Riemannian stochastic gradient method (DRSGD) with a convergence rate of $\mathcal{O}(1/\sqrt{K})$ to a stationary point. To achieve exact convergence with a constant stepsize, we also propose a decentralized Riemannian gradient tracking algorithm (DRGTA) with a convergence rate of $\mathcal{O}(1/K)$ to a stationary point. We use multi-step consensus to keep the iterates in a local (consensus) region. DRGTA is the first decentralized algorithm with exact convergence for distributed optimization on the Stiefel manifold.
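As a rough illustration of the update the abstract describes, below is a minimal NumPy sketch of one DRSGD-style iteration for all agents. It is not the authors' implementation: the mixing matrix W, the stepsize alpha, the number of consensus rounds t, and the oracle egrad(i, X) (agent i's Euclidean gradient at X) are illustrative assumptions; Euclidean averaging followed by a QR retraction stands in for the paper's retraction-based consensus, and DRGTA's gradient-tracking variable is omitted.

import numpy as np

def proj_tangent(X, G):
    # Project an ambient-space gradient G onto the tangent space of the
    # Stiefel manifold at X: G - X * sym(X^T G).
    XtG = X.T @ G
    return G - X @ (XtG + XtG.T) / 2

def retract_qr(Y):
    # QR retraction: map an ambient-space point Y back onto the Stiefel
    # manifold, fixing the sign ambiguity of the QR factorization.
    Q, R = np.linalg.qr(Y)
    return Q * np.sign(np.sign(np.diag(R)) + 0.5)

def drsgd_step(X, W, egrad, alpha=0.1, t=1):
    # One sketched iteration (illustrative, not the paper's exact scheme):
    # t rounds of weighted averaging over the graph, each followed by a
    # retraction, then a local Riemannian gradient step per agent.
    N = len(X)
    for _ in range(t):  # multi-step consensus
        X = [retract_qr(sum(W[i, j] * X[j] for j in range(N)))
             for i in range(N)]
    X_new = []
    for i in range(N):
        rgrad = proj_tangent(X[i], egrad(i, X[i]))  # Riemannian gradient
        X_new.append(retract_qr(X[i] - alpha * rgrad))
    return X_new

Here proj_tangent converts each agent's Euclidean gradient into a Riemannian one, and retract_qr replaces the exponential map with the cheaper QR-based retraction commonly used on the Stiefel manifold.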

Cite this Paper


BibTeX
@InProceedings{pmlr-v139-chen21g,
  title     = {Decentralized Riemannian Gradient Descent on the Stiefel Manifold},
  author    = {Chen, Shixiang and Garcia, Alfredo and Hong, Mingyi and Shahrampour, Shahin},
  booktitle = {Proceedings of the 38th International Conference on Machine Learning},
  pages     = {1594--1605},
  year      = {2021},
  editor    = {Meila, Marina and Zhang, Tong},
  volume    = {139},
  series    = {Proceedings of Machine Learning Research},
  month     = {18--24 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v139/chen21g/chen21g.pdf},
  url       = {https://proceedings.mlr.press/v139/chen21g.html},
  abstract  = {We consider a distributed non-convex optimization where a network of agents aims at minimizing a global function over the Stiefel manifold. The global function is represented as a finite sum of smooth local functions, where each local function is associated with one agent and agents communicate with each other over an undirected connected graph. The problem is non-convex as local functions are possibly non-convex (but smooth) and the Stiefel manifold is a non-convex set. We present a decentralized Riemannian stochastic gradient method (DRSGD) with the convergence rate of $\mathcal{O}(1/\sqrt{K})$ to a stationary point. To have exact convergence with constant stepsize, we also propose a decentralized Riemannian gradient tracking algorithm (DRGTA) with the convergence rate of $\mathcal{O}(1/K)$ to a stationary point. We use multi-step consensus to preserve the iteration in the local (consensus) region. DRGTA is the first decentralized algorithm with exact convergence for distributed optimization on Stiefel manifold.}
}
Endnote
%0 Conference Paper
%T Decentralized Riemannian Gradient Descent on the Stiefel Manifold
%A Shixiang Chen
%A Alfredo Garcia
%A Mingyi Hong
%A Shahin Shahrampour
%B Proceedings of the 38th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2021
%E Marina Meila
%E Tong Zhang
%F pmlr-v139-chen21g
%I PMLR
%P 1594--1605
%U https://proceedings.mlr.press/v139/chen21g.html
%V 139
%X We consider a distributed non-convex optimization where a network of agents aims at minimizing a global function over the Stiefel manifold. The global function is represented as a finite sum of smooth local functions, where each local function is associated with one agent and agents communicate with each other over an undirected connected graph. The problem is non-convex as local functions are possibly non-convex (but smooth) and the Stiefel manifold is a non-convex set. We present a decentralized Riemannian stochastic gradient method (DRSGD) with the convergence rate of $\mathcal{O}(1/\sqrt{K})$ to a stationary point. To have exact convergence with constant stepsize, we also propose a decentralized Riemannian gradient tracking algorithm (DRGTA) with the convergence rate of $\mathcal{O}(1/K)$ to a stationary point. We use multi-step consensus to preserve the iteration in the local (consensus) region. DRGTA is the first decentralized algorithm with exact convergence for distributed optimization on Stiefel manifold.
APA
Chen, S., Garcia, A., Hong, M. & Shahrampour, S. (2021). Decentralized Riemannian Gradient Descent on the Stiefel Manifold. Proceedings of the 38th International Conference on Machine Learning, in Proceedings of Machine Learning Research 139:1594-1605. Available from https://proceedings.mlr.press/v139/chen21g.html.
