The matrix-vector complexity of Ax=b

Michał Dereziński; Ethan N Epperly; Raphael A Meyer

The matrix-vector complexity of Ax=b

Michał Dereziński, Ethan N Epperly, Raphael A Meyer

Proceedings of Thirty Ninth Conference on Learning Theory, PMLR 336:1737-1770, 2026.

Abstract

Matrix–vector algorithms, particularly Krylov subspace methods, are widely viewed as the most effective algorithms for solving large systems of linear equations. This paper establishes lower bounds on the worst-case number of matrix–vector products needed by such an algorithm to approximately solve a general linear system. The first main result is that, for any matrix–vector algorithm which is allowed the use of randomization and can perform products with both a matrix and its transpose, $\Omega(\kappa \log(1/\varepsilon))$ matrix–vector products are necessary to solve a linear system with condition number $\kappa$ to accuracy $\varepsilon$, matching an upper bound for conjugate gradient on the normal equations. The second main result is that one-sided algorithms, which lack access to the transpose, must use $n$ matrix–vector products to solve an $n \times n$ linear system, even when the problem is perfectly conditioned. Both main results include explicit constants that match known upper bounds up to a factor of four. These results rigorously demonstrate the limitations of matrix–vector algorithms and confirm the optimality of widely used Krylov subspace algorithms.

Cite this Paper

BibTeX

@InProceedings{pmlr-v336-derezinski26a,
  title = 	 {The matrix-vector complexity of Ax=b},
  author =       {Derezi{\'n}ski, Micha{\l} and Epperly, Ethan N and Meyer, Raphael A},
  booktitle = 	 {Proceedings of Thirty Ninth Conference on Learning Theory},
  pages = 	 {1737--1770},
  year = 	 {2026},
  editor = 	 {Hanneke, Steve and Lattimore, Tor},
  volume = 	 {336},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {29 Jun--03 Jul},
  publisher =    {PMLR},
  pdf = 	 {https://raw.githubusercontent.com/mlresearch/v336/main/assets/derezinski26a/derezinski26a.pdf},
  url = 	 {https://proceedings.mlr.press/v336/derezinski26a.html},
  abstract = 	 {Matrix–vector algorithms, particularly Krylov subspace methods, are widely viewed as the most effective algorithms for solving large systems of linear equations. This paper establishes lower bounds on the worst-case number of matrix–vector products needed by such an algorithm to approximately solve a general linear system. The first main result is that, for any matrix–vector algorithm which is allowed the use of randomization and can perform products with both a matrix and its transpose, $\Omega(\kappa \log(1/\varepsilon))$ matrix–vector products are necessary to solve a linear system with condition number $\kappa$ to accuracy $\varepsilon$, matching an upper bound for conjugate gradient on the normal equations. The second main result is that one-sided algorithms, which lack access to the transpose, must use $n$ matrix–vector products to solve an $n \times n$ linear system, even when the problem is perfectly conditioned. Both main results include explicit constants that match known upper bounds up to a factor of four. These results rigorously demonstrate the limitations of matrix–vector algorithms and confirm the optimality of widely used Krylov subspace algorithms.}
}

Endnote

%0 Conference Paper
%T The matrix-vector complexity of Ax=b
%A Michał Dereziński
%A Ethan N Epperly
%A Raphael A Meyer
%B Proceedings of Thirty Ninth Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2026
%E Steve Hanneke
%E Tor Lattimore	
%F pmlr-v336-derezinski26a
%I PMLR
%P 1737--1770
%U https://proceedings.mlr.press/v336/derezinski26a.html
%V 336
%X Matrix–vector algorithms, particularly Krylov subspace methods, are widely viewed as the most effective algorithms for solving large systems of linear equations. This paper establishes lower bounds on the worst-case number of matrix–vector products needed by such an algorithm to approximately solve a general linear system. The first main result is that, for any matrix–vector algorithm which is allowed the use of randomization and can perform products with both a matrix and its transpose, $\Omega(\kappa \log(1/\varepsilon))$ matrix–vector products are necessary to solve a linear system with condition number $\kappa$ to accuracy $\varepsilon$, matching an upper bound for conjugate gradient on the normal equations. The second main result is that one-sided algorithms, which lack access to the transpose, must use $n$ matrix–vector products to solve an $n \times n$ linear system, even when the problem is perfectly conditioned. Both main results include explicit constants that match known upper bounds up to a factor of four. These results rigorously demonstrate the limitations of matrix–vector algorithms and confirm the optimality of widely used Krylov subspace algorithms.

APA

Dereziński, M., Epperly, E.N. & Meyer, R.A.. (2026). The matrix-vector complexity of Ax=b. Proceedings of Thirty Ninth Conference on Learning Theory, in Proceedings of Machine Learning Research 336:1737-1770 Available from https://proceedings.mlr.press/v336/derezinski26a.html.

Related Material

Download PDF