A Fast Sampling Algorithm for Maximum Inner Product Search
[edit]
Proceedings of Machine Learning Research, PMLR 89:30043012, 2019.
Abstract
Maximum Inner Product Search (MIPS) has been recognized as an important operation for the inference phase of many machine learning algorithms, including matrix factorization, multiclass/multilabel prediction and neural networks. In this paper, we propose SamplingMIPS, which is the first sampling based algorithm that can be applied to the MIPS problem on a set of general vectors with both positive and negative values. Our SamplingMIPS algorithm is efficient in terms of both time and sample complexity. In particular, by designing a twostep sampling with alias table, SamplingMIPS only requires constant time to draw a candidate. In addition, we show that the probability of candidate generation in our algorithm is consistent with the true ranking induced by the value of the corresponding inner products, and derive the sample complexity of SamplingMIPS to obtain the true candidate. Furthermore, the algorithm can be easily extended to large problems with sparse candidate vectors. Experimental results on real and synthetic datasets show that SamplingMIPS is consistently better than other previous approaches such as LSHMIPS, PCAMIPS and Diamond sampling approach.
Related Material


