The matrix-vector complexity of Ax=b

Michał Dereziński, Ethan N Epperly, Raphael A Meyer
Proceedings of Thirty Ninth Conference on Learning Theory, PMLR 336:1737-1770, 2026.

Abstract

Matrix–vector algorithms, particularly Krylov subspace methods, are widely viewed as the most effective algorithms for solving large systems of linear equations. This paper establishes lower bounds on the worst-case number of matrix–vector products needed by such an algorithm to approximately solve a general linear system. The first main result is that, for any matrix–vector algorithm which is allowed the use of randomization and can perform products with both a matrix and its transpose, $\Omega(\kappa \log(1/\varepsilon))$ matrix–vector products are necessary to solve a linear system with condition number $\kappa$ to accuracy $\varepsilon$, matching an upper bound for conjugate gradient on the normal equations. The second main result is that one-sided algorithms, which lack access to the transpose, must use $n$ matrix–vector products to solve an $n \times n$ linear system, even when the problem is perfectly conditioned. Both main results include explicit constants that match known upper bounds up to a factor of four. These results rigorously demonstrate the limitations of matrix–vector algorithms and confirm the optimality of widely used Krylov subspace algorithms.

Cite this Paper


BibTeX
@InProceedings{pmlr-v336-derezinski26a, title = {The matrix-vector complexity of Ax=b}, author = {Derezi{\'n}ski, Micha{\l} and Epperly, Ethan N and Meyer, Raphael A}, booktitle = {Proceedings of Thirty Ninth Conference on Learning Theory}, pages = {1737--1770}, year = {2026}, editor = {Hanneke, Steve and Lattimore, Tor}, volume = {336}, series = {Proceedings of Machine Learning Research}, month = {29 Jun--03 Jul}, publisher = {PMLR}, pdf = {https://raw.githubusercontent.com/mlresearch/v336/main/assets/derezinski26a/derezinski26a.pdf}, url = {https://proceedings.mlr.press/v336/derezinski26a.html}, abstract = {Matrix–vector algorithms, particularly Krylov subspace methods, are widely viewed as the most effective algorithms for solving large systems of linear equations. This paper establishes lower bounds on the worst-case number of matrix–vector products needed by such an algorithm to approximately solve a general linear system. The first main result is that, for any matrix–vector algorithm which is allowed the use of randomization and can perform products with both a matrix and its transpose, $\Omega(\kappa \log(1/\varepsilon))$ matrix–vector products are necessary to solve a linear system with condition number $\kappa$ to accuracy $\varepsilon$, matching an upper bound for conjugate gradient on the normal equations. The second main result is that one-sided algorithms, which lack access to the transpose, must use $n$ matrix–vector products to solve an $n \times n$ linear system, even when the problem is perfectly conditioned. Both main results include explicit constants that match known upper bounds up to a factor of four. These results rigorously demonstrate the limitations of matrix–vector algorithms and confirm the optimality of widely used Krylov subspace algorithms.} }
Endnote
%0 Conference Paper %T The matrix-vector complexity of Ax=b %A Michał Dereziński %A Ethan N Epperly %A Raphael A Meyer %B Proceedings of Thirty Ninth Conference on Learning Theory %C Proceedings of Machine Learning Research %D 2026 %E Steve Hanneke %E Tor Lattimore %F pmlr-v336-derezinski26a %I PMLR %P 1737--1770 %U https://proceedings.mlr.press/v336/derezinski26a.html %V 336 %X Matrix–vector algorithms, particularly Krylov subspace methods, are widely viewed as the most effective algorithms for solving large systems of linear equations. This paper establishes lower bounds on the worst-case number of matrix–vector products needed by such an algorithm to approximately solve a general linear system. The first main result is that, for any matrix–vector algorithm which is allowed the use of randomization and can perform products with both a matrix and its transpose, $\Omega(\kappa \log(1/\varepsilon))$ matrix–vector products are necessary to solve a linear system with condition number $\kappa$ to accuracy $\varepsilon$, matching an upper bound for conjugate gradient on the normal equations. The second main result is that one-sided algorithms, which lack access to the transpose, must use $n$ matrix–vector products to solve an $n \times n$ linear system, even when the problem is perfectly conditioned. Both main results include explicit constants that match known upper bounds up to a factor of four. These results rigorously demonstrate the limitations of matrix–vector algorithms and confirm the optimality of widely used Krylov subspace algorithms.
APA
Dereziński, M., Epperly, E.N. & Meyer, R.A.. (2026). The matrix-vector complexity of Ax=b. Proceedings of Thirty Ninth Conference on Learning Theory, in Proceedings of Machine Learning Research 336:1737-1770 Available from https://proceedings.mlr.press/v336/derezinski26a.html.

Related Material