Faster Maximum Inner Product Search in High Dimensions

Mo Tiwari, Ryan Kang, Jaeyong Lee, Donghyun Lee, Christopher J Piech, Sebastian Thrun, Ilan Shomorony, Martin Jinye Zhang
Proceedings of the 41st International Conference on Machine Learning, PMLR 235:48344-48361, 2024.

Abstract

Maximum Inner Product Search (MIPS) is a ubiquitous task in machine learning applications. Given a query vector and n other vectors in d dimensions, the MIPS problem is to find the atom that has the highest inner product with the query vector. Existing MIPS algorithms scale at least as O(d) with respect to d, which becomes computationally prohibitive in high-dimensional settings. In this work, we present BanditMIPS, a novel randomized algorithm that provably improves the state-of-the-art complexity from O(d) to O(1) with respect to d. We validate the scaling of BanditMIPS and demonstrate that BanditMIPS outperforms prior state-of-the-art MIPS algorithms in sample complexity, wall-clock time, and precision/speedup tradeoff across a variety of experimental settings. Furthermore, we propose a variant of our algorithm, named BanditMIPS-α, which improves upon BanditMIPS by employing non-uniform sampling across coordinates. We also demonstrate the usefulness of BanditMIPS in problems for which MIPS is a subroutine, including Matching Pursuit and Fourier analysis. Finally, we demonstrate that BanditMIPS can be used in conjunction with preprocessing techniques to improve its complexity with respect to n. All of our experimental results are reproducible via a 1-line script at github.com/ThrunGroup/BanditMIPS.

Cite this Paper


BibTeX
@InProceedings{pmlr-v235-tiwari24a, title = {Faster Maximum Inner Product Search in High Dimensions}, author = {Tiwari, Mo and Kang, Ryan and Lee, Jaeyong and Lee, Donghyun and Piech, Christopher J and Thrun, Sebastian and Shomorony, Ilan and Zhang, Martin Jinye}, booktitle = {Proceedings of the 41st International Conference on Machine Learning}, pages = {48344--48361}, year = {2024}, editor = {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix}, volume = {235}, series = {Proceedings of Machine Learning Research}, month = {21--27 Jul}, publisher = {PMLR}, pdf = {https://raw.githubusercontent.com/mlresearch/v235/main/assets/tiwari24a/tiwari24a.pdf}, url = {https://proceedings.mlr.press/v235/tiwari24a.html}, abstract = {Maximum Inner Product Search (MIPS) is a ubiquitous task in machine learning applications. Given a query vector and $n$ other vectors in $d$ dimensions, the MIPS problem is to find the atom that has the highest inner product with the query vector. Existing MIPS algorithms scale at least as $O(\sqrt{d})$ with respect to $d$, which becomes computationally prohibitive in high-dimensional settings. In this work, we present BanditMIPS, a novel randomized algorithm that provably improves the state-of-the-art complexity from $O(\sqrt{d})$ to $O(1)$ with respect to $d$. We validate the scaling of BanditMIPS and demonstrate that BanditMIPS outperforms prior state-of-the-art MIPS algorithms in sample complexity, wall-clock time, and precision/speedup tradeoff across a variety of experimental settings. Furthermore, we propose a variant of our algorithm, named BanditMIPS-$\alpha$, which improves upon BanditMIPS by employing non-uniform sampling across coordinates. We also demonstrate the usefulness of BanditMIPS in problems for which MIPS is a subroutine, including Matching Pursuit and Fourier analysis. Finally, we demonstrate that BanditMIPS can be used in conjunction with preprocessing techniques to improve its complexity with respect to $n$. All of our experimental results are reproducible via a 1-line script at github.com/ThrunGroup/BanditMIPS.} }
Endnote
%0 Conference Paper %T Faster Maximum Inner Product Search in High Dimensions %A Mo Tiwari %A Ryan Kang %A Jaeyong Lee %A Donghyun Lee %A Christopher J Piech %A Sebastian Thrun %A Ilan Shomorony %A Martin Jinye Zhang %B Proceedings of the 41st International Conference on Machine Learning %C Proceedings of Machine Learning Research %D 2024 %E Ruslan Salakhutdinov %E Zico Kolter %E Katherine Heller %E Adrian Weller %E Nuria Oliver %E Jonathan Scarlett %E Felix Berkenkamp %F pmlr-v235-tiwari24a %I PMLR %P 48344--48361 %U https://proceedings.mlr.press/v235/tiwari24a.html %V 235 %X Maximum Inner Product Search (MIPS) is a ubiquitous task in machine learning applications. Given a query vector and $n$ other vectors in $d$ dimensions, the MIPS problem is to find the atom that has the highest inner product with the query vector. Existing MIPS algorithms scale at least as $O(\sqrt{d})$ with respect to $d$, which becomes computationally prohibitive in high-dimensional settings. In this work, we present BanditMIPS, a novel randomized algorithm that provably improves the state-of-the-art complexity from $O(\sqrt{d})$ to $O(1)$ with respect to $d$. We validate the scaling of BanditMIPS and demonstrate that BanditMIPS outperforms prior state-of-the-art MIPS algorithms in sample complexity, wall-clock time, and precision/speedup tradeoff across a variety of experimental settings. Furthermore, we propose a variant of our algorithm, named BanditMIPS-$\alpha$, which improves upon BanditMIPS by employing non-uniform sampling across coordinates. We also demonstrate the usefulness of BanditMIPS in problems for which MIPS is a subroutine, including Matching Pursuit and Fourier analysis. Finally, we demonstrate that BanditMIPS can be used in conjunction with preprocessing techniques to improve its complexity with respect to $n$. All of our experimental results are reproducible via a 1-line script at github.com/ThrunGroup/BanditMIPS.
APA
Tiwari, M., Kang, R., Lee, J., Lee, D., Piech, C.J., Thrun, S., Shomorony, I. & Zhang, M.J.. (2024). Faster Maximum Inner Product Search in High Dimensions. Proceedings of the 41st International Conference on Machine Learning, in Proceedings of Machine Learning Research 235:48344-48361 Available from https://proceedings.mlr.press/v235/tiwari24a.html.

Related Material