Accelerating Large-Scale Inference with Anisotropic Vector Quantization

Ruiqi Guo, Philip Sun, Erik Lindgren, Quan Geng, David Simcha, Felix Chern, Sanjiv Kumar
Proceedings of the 37th International Conference on Machine Learning, PMLR 119:3887-3896, 2020.

Abstract

Quantization based techniques are the current state-of-the-art for scaling maximum inner product search to massive databases. Traditional approaches to quantization aim to minimize the reconstruction error of the database points. Based on the observation that for a given query, the database points that have the largest inner products are more relevant, we develop a family of anisotropic quantization loss functions. Under natural statistical assumptions, we show that quantization with these loss functions leads to a new variant of vector quantization that more greatly penalizes the parallel component of a datapoint’s residual relative to its orthogonal component. The proposed approach, whose implementation is open-source, achieves state-of-the-art results on the public benchmarks available at ann-benchmarks.com.
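The loss described in the abstract can be illustrated with a short sketch: the quantization residual of a datapoint is split into its component parallel to the datapoint and its orthogonal component, and the parallel error is weighted more heavily. The function name and the weight `eta` below are illustrative assumptions, not the paper's exact formulation (the paper derives the weighting from statistical assumptions on the query distribution).

```python
import numpy as np

def anisotropic_loss(x, x_quantized, eta=4.0):
    """Sketch of an anisotropic quantization loss.

    Decomposes the residual r = x - x_quantized into a component
    parallel to x and a component orthogonal to x, and penalizes
    the parallel component by a factor eta. With eta = 1 this
    reduces to the ordinary squared reconstruction error.
    """
    r = x - x_quantized                     # quantization residual
    r_par = (np.dot(r, x) / np.dot(x, x)) * x   # component along x
    r_orth = r - r_par                      # component orthogonal to x
    return eta * np.dot(r_par, r_par) + np.dot(r_orth, r_orth)
```

Setting `eta > 1` reflects the paper's observation that error parallel to the datapoint distorts large inner products more than orthogonal error does.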

Cite this Paper


BibTeX
@InProceedings{pmlr-v119-guo20h,
  title     = {Accelerating Large-Scale Inference with Anisotropic Vector Quantization},
  author    = {Guo, Ruiqi and Sun, Philip and Lindgren, Erik and Geng, Quan and Simcha, David and Chern, Felix and Kumar, Sanjiv},
  booktitle = {Proceedings of the 37th International Conference on Machine Learning},
  pages     = {3887--3896},
  year      = {2020},
  editor    = {III, Hal Daumé and Singh, Aarti},
  volume    = {119},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--18 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v119/guo20h/guo20h.pdf},
  url       = {http://proceedings.mlr.press/v119/guo20h.html},
  abstract  = {Quantization based techniques are the current state-of-the-art for scaling maximum inner product search to massive databases. Traditional approaches to quantization aim to minimize the reconstruction error of the database points. Based on the observation that for a given query, the database points that have the largest inner products are more relevant, we develop a family of anisotropic quantization loss functions. Under natural statistical assumptions, we show that quantization with these loss functions leads to a new variant of vector quantization that more greatly penalizes the parallel component of a datapoint’s residual relative to its orthogonal component. The proposed approach, whose implementation is open-source, achieves state-of-the-art results on the public benchmarks available at ann-benchmarks.com.}
}
Endnote
%0 Conference Paper
%T Accelerating Large-Scale Inference with Anisotropic Vector Quantization
%A Ruiqi Guo
%A Philip Sun
%A Erik Lindgren
%A Quan Geng
%A David Simcha
%A Felix Chern
%A Sanjiv Kumar
%B Proceedings of the 37th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2020
%E Hal Daumé III
%E Aarti Singh
%F pmlr-v119-guo20h
%I PMLR
%P 3887--3896
%U http://proceedings.mlr.press/v119/guo20h.html
%V 119
%X Quantization based techniques are the current state-of-the-art for scaling maximum inner product search to massive databases. Traditional approaches to quantization aim to minimize the reconstruction error of the database points. Based on the observation that for a given query, the database points that have the largest inner products are more relevant, we develop a family of anisotropic quantization loss functions. Under natural statistical assumptions, we show that quantization with these loss functions leads to a new variant of vector quantization that more greatly penalizes the parallel component of a datapoint’s residual relative to its orthogonal component. The proposed approach, whose implementation is open-source, achieves state-of-the-art results on the public benchmarks available at ann-benchmarks.com.
APA
Guo, R., Sun, P., Lindgren, E., Geng, Q., Simcha, D., Chern, F. & Kumar, S. (2020). Accelerating Large-Scale Inference with Anisotropic Vector Quantization. Proceedings of the 37th International Conference on Machine Learning, in Proceedings of Machine Learning Research 119:3887-3896. Available from http://proceedings.mlr.press/v119/guo20h.html.