Optimization without Retraction on the Random Generalized Stiefel Manifold

Simon Vary, Pierre Ablin, Bin Gao, Pierre-Antoine Absil
Proceedings of the 41st International Conference on Machine Learning, PMLR 235:49226-49248, 2024.

Abstract

Optimization over the set of matrices $X$ that satisfy $X^\top B X = I_p$, referred to as the generalized Stiefel manifold, appears in many applications involving sampled covariance matrices such as canonical correlation analysis (CCA), independent component analysis (ICA), and the generalized eigenvalue problem (GEVP). Solving these problems is typically done by iterative methods that require a fully formed $B$. We propose a cheap stochastic iterative method that solves the optimization problem while having access only to random estimates of $B$. Our method does not enforce the constraint in every iteration; instead, it produces iterates that converge to critical points on the generalized Stiefel manifold defined in expectation. The method has lower per-iteration cost, requires only matrix multiplications, and has the same convergence rates as its Riemannian optimization counterparts that require the full matrix $B$. Experiments demonstrate its effectiveness in various machine learning applications involving generalized orthogonality constraints, including CCA, ICA, and the GEVP.
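To give a concrete feel for the kind of retraction-free, multiplication-only update the abstract describes, below is a minimal sketch in NumPy. It performs a gradient step on a penalized objective $f(X) + \tfrac{\lambda}{4}\|X^\top \hat{B} X - I_p\|_F^2$, where $\hat{B}$ is a fresh minibatch estimate of $B$ at every step, so the constraint is never enforced exactly per iteration. The function names, the toy GEVP-style objective, and the step sizes are illustrative assumptions; this is not the paper's exact landing update, only a simple stand-in with the same access model (random estimates of $B$, matrix multiplications only, no retraction).

```python
import numpy as np

def stochastic_penalty_step(X, A, Z_batch, eta=2e-3, lam=1.0):
    """One retraction-free update for  min_X -0.5*tr(X^T A X)  s.t.  X^T B X = I_p,
    where B = E[z z^T] is only accessed through the minibatch estimate B_hat.

    Illustrative penalty-style sketch (not the paper's exact algorithm):
    a gradient step on  f(X) + lam/4 * ||X^T B_hat X - I||_F^2.
    """
    p = X.shape[1]
    B_hat = Z_batch.T @ Z_batch / Z_batch.shape[0]   # random estimate of B
    grad_f = -A @ X                                  # Euclidean gradient of f
    N = X.T @ B_hat @ X - np.eye(p)                  # constraint residual
    grad_pen = B_hat @ X @ N                         # gradient of the penalty term
    return X - eta * (grad_f + lam * grad_pen)       # matrix multiplications only


# Tiny usage example on synthetic data (all sizes and constants are arbitrary).
rng = np.random.default_rng(0)
n, p, batch = 50, 3, 32
A = rng.standard_normal((n, n)); A = A @ A.T / n                    # PSD objective matrix
C = rng.standard_normal((n, n)); B_true = C @ C.T / n + np.eye(n)   # true (unseen) B

X = rng.standard_normal((n, p)) / np.sqrt(n)
for _ in range(3000):
    Z = rng.multivariate_normal(np.zeros(n), B_true, size=batch)    # samples with E[z z^T] = B_true
    X = stochastic_penalty_step(X, A, Z)

print("constraint violation:", np.linalg.norm(X.T @ B_true @ X - np.eye(p)))
```

Because $\hat{B}$ is noisy and the penalty is not exact, the iterates only approach the generalized Stiefel manifold rather than staying on it; the appeal of such schemes is that each step costs a few matrix products instead of a retraction involving the full $B$.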

Cite this Paper


BibTeX
@InProceedings{pmlr-v235-vary24a,
  title     = {Optimization without Retraction on the Random Generalized Stiefel Manifold},
  author    = {Vary, Simon and Ablin, Pierre and Gao, Bin and Absil, Pierre-Antoine},
  booktitle = {Proceedings of the 41st International Conference on Machine Learning},
  pages     = {49226--49248},
  year      = {2024},
  editor    = {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix},
  volume    = {235},
  series    = {Proceedings of Machine Learning Research},
  month     = {21--27 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v235/main/assets/vary24a/vary24a.pdf},
  url       = {https://proceedings.mlr.press/v235/vary24a.html},
  abstract  = {Optimization over the set of matrices $X$ that satisfy $X^\top B X = I_p$, referred to as the generalized Stiefel manifold, appears in many applications involving sampled covariance matrices such as canonical correlation analysis (CCA), independent component analysis (ICA), and the generalized eigenvalue problem (GEVP). Solving these problems is typically done by iterative methods that require a fully formed $B$. We propose a cheap stochastic iterative method that solves the optimization problem while having access only to random estimates of $B$. Our method does not enforce the constraint in every iteration; instead, it produces iterates that converge to critical points on the generalized Stiefel manifold defined in expectation. The method has lower per-iteration cost, requires only matrix multiplications, and has the same convergence rates as its Riemannian optimization counterparts that require the full matrix $B$. Experiments demonstrate its effectiveness in various machine learning applications involving generalized orthogonality constraints, including CCA, ICA, and the GEVP.}
}
Endnote
%0 Conference Paper
%T Optimization without Retraction on the Random Generalized Stiefel Manifold
%A Simon Vary
%A Pierre Ablin
%A Bin Gao
%A Pierre-Antoine Absil
%B Proceedings of the 41st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Ruslan Salakhutdinov
%E Zico Kolter
%E Katherine Heller
%E Adrian Weller
%E Nuria Oliver
%E Jonathan Scarlett
%E Felix Berkenkamp
%F pmlr-v235-vary24a
%I PMLR
%P 49226--49248
%U https://proceedings.mlr.press/v235/vary24a.html
%V 235
%X Optimization over the set of matrices $X$ that satisfy $X^\top B X = I_p$, referred to as the generalized Stiefel manifold, appears in many applications involving sampled covariance matrices such as canonical correlation analysis (CCA), independent component analysis (ICA), and the generalized eigenvalue problem (GEVP). Solving these problems is typically done by iterative methods that require a fully formed $B$. We propose a cheap stochastic iterative method that solves the optimization problem while having access only to random estimates of $B$. Our method does not enforce the constraint in every iteration; instead, it produces iterates that converge to critical points on the generalized Stiefel manifold defined in expectation. The method has lower per-iteration cost, requires only matrix multiplications, and has the same convergence rates as its Riemannian optimization counterparts that require the full matrix $B$. Experiments demonstrate its effectiveness in various machine learning applications involving generalized orthogonality constraints, including CCA, ICA, and the GEVP.
APA
Vary, S., Ablin, P., Gao, B., &amp; Absil, P.-A. (2024). Optimization without Retraction on the Random Generalized Stiefel Manifold. Proceedings of the 41st International Conference on Machine Learning, in Proceedings of Machine Learning Research 235:49226-49248. Available from https://proceedings.mlr.press/v235/vary24a.html.

Related Material

Download PDF: https://raw.githubusercontent.com/mlresearch/v235/main/assets/vary24a/vary24a.pdf