Simplifying Momentum-based Positive-definite Submanifold Optimization with Applications to Deep Learning

Wu Lin, Valentin Duruisseaux, Melvin Leok, Frank Nielsen, Mohammad Emtiyaz Khan, Mark Schmidt
Proceedings of the 40th International Conference on Machine Learning, PMLR 202:21026-21050, 2023.

Abstract

Riemannian submanifold optimization with momentum is computationally challenging because, to ensure that the iterates remain on the submanifold, we often need to solve difficult differential equations. Here, we simplify such difficulties for a class of structured symmetric positive-definite matrices with the affine-invariant metric. We do so by proposing a generalized version of the Riemannian normal coordinates that dynamically orthonormalizes the metric and locally converts the problem into an unconstrained problem in the Euclidean space. We use our approach to simplify existing approaches for structured covariances and develop matrix-inverse-free $2^\text{nd}$-order optimizers for deep learning in low precision settings.

Cite this Paper


BibTeX
@InProceedings{pmlr-v202-lin23c,
  title = {Simplifying Momentum-based Positive-definite Submanifold Optimization with Applications to Deep Learning},
  author = {Lin, Wu and Duruisseaux, Valentin and Leok, Melvin and Nielsen, Frank and Khan, Mohammad Emtiyaz and Schmidt, Mark},
  booktitle = {Proceedings of the 40th International Conference on Machine Learning},
  pages = {21026--21050},
  year = {2023},
  editor = {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan},
  volume = {202},
  series = {Proceedings of Machine Learning Research},
  month = {23--29 Jul},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v202/lin23c/lin23c.pdf},
  url = {https://proceedings.mlr.press/v202/lin23c.html},
  abstract = {Riemannian submanifold optimization with momentum is computationally challenging because, to ensure that the iterates remain on the submanifold, we often need to solve difficult differential equations. Here, we simplify such difficulties for a class of structured symmetric positive-definite matrices with the affine-invariant metric. We do so by proposing a generalized version of the Riemannian normal coordinates that dynamically orthonormalizes the metric and locally converts the problem into an unconstrained problem in the Euclidean space. We use our approach to simplify existing approaches for structured covariances and develop matrix-inverse-free $2^\text{nd}$-order optimizers for deep learning in low precision settings.}
}
Endnote
%0 Conference Paper
%T Simplifying Momentum-based Positive-definite Submanifold Optimization with Applications to Deep Learning
%A Wu Lin
%A Valentin Duruisseaux
%A Melvin Leok
%A Frank Nielsen
%A Mohammad Emtiyaz Khan
%A Mark Schmidt
%B Proceedings of the 40th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Andreas Krause
%E Emma Brunskill
%E Kyunghyun Cho
%E Barbara Engelhardt
%E Sivan Sabato
%E Jonathan Scarlett
%F pmlr-v202-lin23c
%I PMLR
%P 21026--21050
%U https://proceedings.mlr.press/v202/lin23c.html
%V 202
%X Riemannian submanifold optimization with momentum is computationally challenging because, to ensure that the iterates remain on the submanifold, we often need to solve difficult differential equations. Here, we simplify such difficulties for a class of structured symmetric positive-definite matrices with the affine-invariant metric. We do so by proposing a generalized version of the Riemannian normal coordinates that dynamically orthonormalizes the metric and locally converts the problem into an unconstrained problem in the Euclidean space. We use our approach to simplify existing approaches for structured covariances and develop matrix-inverse-free $2^\text{nd}$-order optimizers for deep learning in low precision settings.
APA
Lin, W., Duruisseaux, V., Leok, M., Nielsen, F., Khan, M.E. & Schmidt, M. (2023). Simplifying Momentum-based Positive-definite Submanifold Optimization with Applications to Deep Learning. Proceedings of the 40th International Conference on Machine Learning, in Proceedings of Machine Learning Research 202:21026-21050. Available from https://proceedings.mlr.press/v202/lin23c.html.
