A Fast Sampling Algorithm for Maximum Inner Product Search
Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics, PMLR 89:3004-3012, 2019.
Abstract
Maximum Inner Product Search (MIPS) has been recognized as an important operation for the inference phase of many machine learning algorithms, including matrix factorization, multi-class/multi-label prediction and neural networks. In this paper, we propose Sampling-MIPS, which is the first sampling-based algorithm that can be applied to the MIPS problem on a set of general vectors with both positive and negative values. Our Sampling-MIPS algorithm is efficient in terms of both time and sample complexity. In particular, by designing a two-step sampling scheme with an alias table, Sampling-MIPS requires only constant time to draw a candidate. In addition, we show that the probability of candidate generation in our algorithm is consistent with the true ranking induced by the values of the corresponding inner products, and derive the sample complexity of Sampling-MIPS to obtain the true candidate. Furthermore, the algorithm can be easily extended to large problems with sparse candidate vectors. Experimental results on real and synthetic datasets show that Sampling-MIPS is consistently better than previous approaches such as LSH-MIPS, PCA-MIPS and Diamond Sampling.
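The constant-time candidate draw mentioned above rests on alias-table sampling, the standard technique for drawing from a fixed discrete distribution in O(1) time after linear preprocessing. The sketch below illustrates that building block only; the weight values and function names are illustrative assumptions, not the paper's exact Sampling-MIPS construction.

```python
import random

def build_alias_table(weights):
    """Precompute probability/alias tables (Vose's method) for O(1) draws.

    `weights` stand in for the non-negative sampling scores used by a
    sampling-based MIPS scheme (an illustrative assumption, not the
    paper's exact weighting).
    """
    n = len(weights)
    total = sum(weights)
    scaled = [w * n / total for w in weights]
    prob = [0.0] * n
    alias = [0] * n
    small = [i for i, p in enumerate(scaled) if p < 1.0]
    large = [i for i, p in enumerate(scaled) if p >= 1.0]
    while small and large:
        s, l = small.pop(), large.pop()
        prob[s] = scaled[s]          # probability of keeping column s
        alias[s] = l                 # otherwise redirect to column l
        scaled[l] -= 1.0 - scaled[s] # donate mass from the large entry
        (small if scaled[l] < 1.0 else large).append(l)
    for i in small + large:          # leftovers are numerically ~1
        prob[i] = 1.0
    return prob, alias

def draw(prob, alias):
    """Draw one index in O(1): pick a column uniformly, then flip a biased coin."""
    i = random.randrange(len(prob))
    return i if random.random() < prob[i] else alias[i]

# Usage: repeated O(1) draws produce candidates whose empirical frequencies
# match the weights, so high-scoring items surface after few samples.
prob, alias = build_alias_table([0.5, 3.0, 1.0, 0.25])
counts = [0, 0, 0, 0]
for _ in range(10000):
    counts[draw(prob, alias)] += 1
print(counts)
```

In a two-step scheme of the kind the abstract describes, one such table would typically be consulted per step of the draw, keeping each candidate generation at constant cost; the precise decomposition is given in the paper itself.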