Faster Maximum Inner Product Search in High Dimensions

Mo Tiwari; Ryan Kang; Jaeyong Lee; Donghyun Lee; Christopher J Piech; Sebastian Thrun; Ilan Shomorony; Martin Jinye Zhang

Faster Maximum Inner Product Search in High Dimensions

Mo Tiwari, Ryan Kang, Jaeyong Lee, Donghyun Lee, Christopher J Piech, Sebastian Thrun, Ilan Shomorony, Martin Jinye Zhang

Proceedings of the 41st International Conference on Machine Learning, PMLR 235:48344-48361, 2024.

Abstract

Maximum Inner Product Search (MIPS) is a ubiquitous task in machine learning applications. Given a query vector and

$n$ other vectors in

$d$ dimensions, the MIPS problem is to find the atom that has the highest inner product with the query vector. Existing MIPS algorithms scale at least as

$O(\sqrt{d})$ with respect to

$d$ , which becomes computationally prohibitive in high-dimensional settings. In this work, we present BanditMIPS, a novel randomized algorithm that provably improves the state-of-the-art complexity from

$O(\sqrt{d})$ to

$O(1)$ with respect to

$d$ . We validate the scaling of BanditMIPS and demonstrate that BanditMIPS outperforms prior state-of-the-art MIPS algorithms in sample complexity, wall-clock time, and precision/speedup tradeoff across a variety of experimental settings. Furthermore, we propose a variant of our algorithm, named BanditMIPS-

$\alpha$ , which improves upon BanditMIPS by employing non-uniform sampling across coordinates. We also demonstrate the usefulness of BanditMIPS in problems for which MIPS is a subroutine, including Matching Pursuit and Fourier analysis. Finally, we demonstrate that BanditMIPS can be used in conjunction with preprocessing techniques to improve its complexity with respect to

$n$ . All of our experimental results are reproducible via a 1-line script at github.com/ThrunGroup/BanditMIPS.

Cite this Paper

BibTeX


@InProceedings{pmlr-v235-tiwari24a,
  title = 	 {Faster Maximum Inner Product Search in High Dimensions},
  author =       {Tiwari, Mo and Kang, Ryan and Lee, Jaeyong and Lee, Donghyun and Piech, Christopher J and Thrun, Sebastian and Shomorony, Ilan and Zhang, Martin Jinye},
  booktitle = 	 {Proceedings of the 41st International Conference on Machine Learning},
  pages = 	 {48344--48361},
  year = 	 {2024},
  editor = 	 {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix},
  volume = 	 {235},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {21--27 Jul},
  publisher =    {PMLR},
  pdf = 	 {https://raw.githubusercontent.com/mlresearch/v235/main/assets/tiwari24a/tiwari24a.pdf},
  url = 	 {https://proceedings.mlr.press/v235/tiwari24a.html},
  abstract = 	 {Maximum Inner Product Search (MIPS) is a ubiquitous task in machine learning applications. Given a query vector and $n$ other vectors in $d$ dimensions, the MIPS problem is to find the atom that has the highest inner product with the query vector. Existing MIPS algorithms scale at least as $O(\sqrt{d})$ with respect to $d$, which becomes computationally prohibitive in high-dimensional settings. In this work, we present BanditMIPS, a novel randomized algorithm that provably improves the state-of-the-art complexity from $O(\sqrt{d})$ to $O(1)$ with respect to $d$. We validate the scaling of BanditMIPS and demonstrate that BanditMIPS outperforms prior state-of-the-art MIPS algorithms in sample complexity, wall-clock time, and precision/speedup tradeoff across a variety of experimental settings. Furthermore, we propose a variant of our algorithm, named BanditMIPS-$\alpha$, which improves upon BanditMIPS by employing non-uniform sampling across coordinates. We also demonstrate the usefulness of BanditMIPS in problems for which MIPS is a subroutine, including Matching Pursuit and Fourier analysis. Finally, we demonstrate that BanditMIPS can be used in conjunction with preprocessing techniques to improve its complexity with respect to $n$. All of our experimental results are reproducible via a 1-line script at github.com/ThrunGroup/BanditMIPS.}
}

Endnote

%0 Conference Paper
%T Faster Maximum Inner Product Search in High Dimensions
%A Mo Tiwari
%A Ryan Kang
%A Jaeyong Lee
%A Donghyun Lee
%A Christopher J Piech
%A Sebastian Thrun
%A Ilan Shomorony
%A Martin Jinye Zhang
%B Proceedings of the 41st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Ruslan Salakhutdinov
%E Zico Kolter
%E Katherine Heller
%E Adrian Weller
%E Nuria Oliver
%E Jonathan Scarlett
%E Felix Berkenkamp	
%F pmlr-v235-tiwari24a
%I PMLR
%P 48344--48361
%U https://proceedings.mlr.press/v235/tiwari24a.html
%V 235
%X Maximum Inner Product Search (MIPS) is a ubiquitous task in machine learning applications. Given a query vector and $n$ other vectors in $d$ dimensions, the MIPS problem is to find the atom that has the highest inner product with the query vector. Existing MIPS algorithms scale at least as $O(\sqrt{d})$ with respect to $d$, which becomes computationally prohibitive in high-dimensional settings. In this work, we present BanditMIPS, a novel randomized algorithm that provably improves the state-of-the-art complexity from $O(\sqrt{d})$ to $O(1)$ with respect to $d$. We validate the scaling of BanditMIPS and demonstrate that BanditMIPS outperforms prior state-of-the-art MIPS algorithms in sample complexity, wall-clock time, and precision/speedup tradeoff across a variety of experimental settings. Furthermore, we propose a variant of our algorithm, named BanditMIPS-$\alpha$, which improves upon BanditMIPS by employing non-uniform sampling across coordinates. We also demonstrate the usefulness of BanditMIPS in problems for which MIPS is a subroutine, including Matching Pursuit and Fourier analysis. Finally, we demonstrate that BanditMIPS can be used in conjunction with preprocessing techniques to improve its complexity with respect to $n$. All of our experimental results are reproducible via a 1-line script at github.com/ThrunGroup/BanditMIPS.

APA


Tiwari, M., Kang, R., Lee, J., Lee, D., Piech, C.J., Thrun, S., Shomorony, I. & Zhang, M.J.. (2024). Faster Maximum Inner Product Search in High Dimensions. Proceedings of the 41st International Conference on Machine Learning, in Proceedings of Machine Learning Research 235:48344-48361 Available from https://proceedings.mlr.press/v235/tiwari24a.html.

Faster Maximum Inner Product Search in High Dimensions

Abstract

Cite this Paper

Related Material