A Fast Sampling Algorithm for Maximum Inner Product Search

QIN DING, Hsiang-Fu Yu, Cho-Jui Hsieh
Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics, PMLR 89:3004-3012, 2019.

Abstract

Maximum Inner Product Search (MIPS) has been recognized as an important operation for the inference phase of many machine learning algorithms, including matrix factorization, multi-class/multi-label prediction and neural networks. In this paper, we propose Sampling-MIPS, which is the first sampling based algorithm that can be applied to the MIPS problem on a set of general vectors with both positive and negative values. Our Sampling-MIPS algorithm is efficient in terms of both time and sample complexity. In particular, by designing a two-step sampling with alias table, Sampling-MIPS only requires constant time to draw a candidate. In addition, we show that the probability of candidate generation in our algorithm is consistent with the true ranking induced by the value of the corresponding inner products, and derive the sample complexity of Sampling-MIPS to obtain the true candidate. Furthermore, the algorithm can be easily extended to large problems with sparse candidate vectors. Experimental results on real and synthetic datasets show that Sampling-MIPS is consistently better than other previous approaches such as LSH-MIPS, PCA-MIPS and Diamond sampling approach.

Cite this Paper


BibTeX
@InProceedings{pmlr-v89-ding19a, title = {A Fast Sampling Algorithm for Maximum Inner Product Search}, author = {DING, QIN and Yu, Hsiang-Fu and Hsieh, Cho-Jui}, booktitle = {Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics}, pages = {3004--3012}, year = {2019}, editor = {Chaudhuri, Kamalika and Sugiyama, Masashi}, volume = {89}, series = {Proceedings of Machine Learning Research}, month = {16--18 Apr}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v89/ding19a/ding19a.pdf}, url = {https://proceedings.mlr.press/v89/ding19a.html}, abstract = {Maximum Inner Product Search (MIPS) has been recognized as an important operation for the inference phase of many machine learning algorithms, including matrix factorization, multi-class/multi-label prediction and neural networks. In this paper, we propose Sampling-MIPS, which is the first sampling based algorithm that can be applied to the MIPS problem on a set of general vectors with both positive and negative values. Our Sampling-MIPS algorithm is efficient in terms of both time and sample complexity. In particular, by designing a two-step sampling with alias table, Sampling-MIPS only requires constant time to draw a candidate. In addition, we show that the probability of candidate generation in our algorithm is consistent with the true ranking induced by the value of the corresponding inner products, and derive the sample complexity of Sampling-MIPS to obtain the true candidate. Furthermore, the algorithm can be easily extended to large problems with sparse candidate vectors. Experimental results on real and synthetic datasets show that Sampling-MIPS is consistently better than other previous approaches such as LSH-MIPS, PCA-MIPS and Diamond sampling approach.} }
Endnote
%0 Conference Paper %T A Fast Sampling Algorithm for Maximum Inner Product Search %A QIN DING %A Hsiang-Fu Yu %A Cho-Jui Hsieh %B Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics %C Proceedings of Machine Learning Research %D 2019 %E Kamalika Chaudhuri %E Masashi Sugiyama %F pmlr-v89-ding19a %I PMLR %P 3004--3012 %U https://proceedings.mlr.press/v89/ding19a.html %V 89 %X Maximum Inner Product Search (MIPS) has been recognized as an important operation for the inference phase of many machine learning algorithms, including matrix factorization, multi-class/multi-label prediction and neural networks. In this paper, we propose Sampling-MIPS, which is the first sampling based algorithm that can be applied to the MIPS problem on a set of general vectors with both positive and negative values. Our Sampling-MIPS algorithm is efficient in terms of both time and sample complexity. In particular, by designing a two-step sampling with alias table, Sampling-MIPS only requires constant time to draw a candidate. In addition, we show that the probability of candidate generation in our algorithm is consistent with the true ranking induced by the value of the corresponding inner products, and derive the sample complexity of Sampling-MIPS to obtain the true candidate. Furthermore, the algorithm can be easily extended to large problems with sparse candidate vectors. Experimental results on real and synthetic datasets show that Sampling-MIPS is consistently better than other previous approaches such as LSH-MIPS, PCA-MIPS and Diamond sampling approach.
APA
DING, Q., Yu, H. & Hsieh, C.. (2019). A Fast Sampling Algorithm for Maximum Inner Product Search. Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 89:3004-3012 Available from https://proceedings.mlr.press/v89/ding19a.html.

Related Material