Averaging Stochastic Gradient Descent on Riemannian Manifolds

Nilesh Tripuraneni, Nicolas Flammarion, Francis Bach, Michael I. Jordan
Proceedings of the 31st Conference On Learning Theory, PMLR 75:650-687, 2018.

Abstract

We consider the minimization of a function defined on a Riemannian manifold $\mathcal{M}$ accessible only through unbiased estimates of its gradients. We develop a geometric framework to transform a sequence of slowly converging iterates generated from stochastic gradient descent (SGD) on $\mathcal{M}$ to an averaged iterate sequence with a robust and fast $O(1/n)$ convergence rate. We then present an application of our framework to geodesically-strongly-convex (and possibly Euclidean non-convex) problems. Finally, we demonstrate how these ideas apply to the case of streaming $k$-PCA, where we show how to accelerate the slow rate of the randomized power method (without requiring knowledge of the eigengap) into a robust algorithm achieving the optimal rate of convergence.
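The simplest instance of this setup is streaming top-eigenvector estimation (k = 1 PCA) on the unit sphere. The sketch below is a hedged illustration of the paper's high-level idea, not the authors' algorithm: it runs an Oja-style stochastic gradient step with a slow step size, retracts to the sphere, and keeps a running (Euclidean) average of the iterates that is retracted at the end, a crude stand-in for the paper's geometric averaging scheme. All names and constants here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 20

# Ground-truth covariance with one dominant direction (eigengap = 1).
v_star = np.zeros(d)
v_star[0] = 1.0
cov = 0.1 * np.eye(d) + np.outer(v_star, v_star)
chol = np.linalg.cholesky(cov)

# Random initialization on the sphere.
w = rng.normal(size=d)
w /= np.linalg.norm(w)

avg = np.zeros(d)
n = 20000
for t in range(1, n + 1):
    x = chol @ rng.normal(size=d)   # stream sample with E[x xᵀ] = cov
    eta = 1.0 / np.sqrt(t)          # slow O(1/√t) step size
    w = w + eta * x * (x @ w)       # stochastic gradient (Oja) step
    w /= np.linalg.norm(w)          # retraction back to the sphere
    avg += (w - avg) / t            # running average of the iterates

w_avg = avg / np.linalg.norm(avg)   # retract the averaged iterate
align = abs(w_avg @ v_star)
print(f"alignment with top eigenvector: {align:.3f}")
```

The individual iterates `w` fluctuate at the O(1/√t) scale set by the step size; the averaged iterate `w_avg` smooths this noise out, which is the effect the paper makes precise (with a proper tangent-space average) to recover the fast O(1/n) rate without knowing the eigengap.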

Cite this Paper


BibTeX
@InProceedings{pmlr-v75-tripuraneni18a,
  title     = {Averaging Stochastic Gradient Descent on {R}iemannian Manifolds},
  author    = {Tripuraneni, Nilesh and Flammarion, Nicolas and Bach, Francis and Jordan, Michael I.},
  booktitle = {Proceedings of the 31st Conference On Learning Theory},
  pages     = {650--687},
  year      = {2018},
  editor    = {Bubeck, Sébastien and Perchet, Vianney and Rigollet, Philippe},
  volume    = {75},
  series    = {Proceedings of Machine Learning Research},
  month     = {06--09 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v75/tripuraneni18a/tripuraneni18a.pdf},
  url       = {https://proceedings.mlr.press/v75/tripuraneni18a.html},
  abstract  = {We consider the minimization of a function defined on a Riemannian manifold $\mathcal{M}$ accessible only through unbiased estimates of its gradients. We develop a geometric framework to transform a sequence of slowly converging iterates generated from stochastic gradient descent (SGD) on $\mathcal{M}$ to an averaged iterate sequence with a robust and fast $O(1/n)$ convergence rate. We then present an application of our framework to geodesically-strongly-convex (and possibly Euclidean non-convex) problems. Finally, we demonstrate how these ideas apply to the case of streaming $k$-PCA, where we show how to accelerate the slow rate of the randomized power method (without requiring knowledge of the eigengap) into a robust algorithm achieving the optimal rate of convergence.}
}
Endnote
%0 Conference Paper
%T Averaging Stochastic Gradient Descent on Riemannian Manifolds
%A Nilesh Tripuraneni
%A Nicolas Flammarion
%A Francis Bach
%A Michael I. Jordan
%B Proceedings of the 31st Conference On Learning Theory
%C Proceedings of Machine Learning Research
%D 2018
%E Sébastien Bubeck
%E Vianney Perchet
%E Philippe Rigollet
%F pmlr-v75-tripuraneni18a
%I PMLR
%P 650--687
%U https://proceedings.mlr.press/v75/tripuraneni18a.html
%V 75
%X We consider the minimization of a function defined on a Riemannian manifold $\mathcal{M}$ accessible only through unbiased estimates of its gradients. We develop a geometric framework to transform a sequence of slowly converging iterates generated from stochastic gradient descent (SGD) on $\mathcal{M}$ to an averaged iterate sequence with a robust and fast $O(1/n)$ convergence rate. We then present an application of our framework to geodesically-strongly-convex (and possibly Euclidean non-convex) problems. Finally, we demonstrate how these ideas apply to the case of streaming $k$-PCA, where we show how to accelerate the slow rate of the randomized power method (without requiring knowledge of the eigengap) into a robust algorithm achieving the optimal rate of convergence.
APA
Tripuraneni, N., Flammarion, N., Bach, F. & Jordan, M. I. (2018). Averaging Stochastic Gradient Descent on Riemannian Manifolds. Proceedings of the 31st Conference On Learning Theory, in Proceedings of Machine Learning Research 75:650-687. Available from https://proceedings.mlr.press/v75/tripuraneni18a.html.