Approximate Nearest Neighbor Search with Window Filters

Joshua Engels, Ben Landrum, Shangdi Yu, Laxman Dhulipala, Julian Shun
Proceedings of the 41st International Conference on Machine Learning, PMLR 235:12469-12490, 2024.

Abstract

We define and investigate the problem of c-approximate window search: approximate nearest neighbor search where each point in the dataset has a numeric label, and the goal is to find nearest neighbors to queries within arbitrary label ranges. Many semantic search problems, such as image and document search with timestamp filters, or product search with cost filters, are natural examples of this problem. We propose and theoretically analyze a modular tree-based framework for transforming an index that solves the traditional c-approximate nearest neighbor problem into a data structure that solves window search. On standard nearest neighbor benchmark datasets equipped with random label values, adversarially constructed embeddings, and image search embeddings with real timestamps, we obtain up to a $75\times$ speedup over existing solutions at the same level of recall.
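To make the problem statement concrete, here is a minimal sketch of window search over labeled points. It is not the authors' implementation. It assumes a segment-tree-style split over the label-sorted points, which is only one possible reading of "tree-based", and it uses exact brute-force search as a stand-in for a c-approximate nearest neighbor index. The names `BruteForceIndex`, `WindowTree`, and `leaf_size` are hypothetical.

```python
import numpy as np


class BruteForceIndex:
    """Hypothetical stand-in for any nearest neighbor index (exact search here)."""

    def __init__(self, points, ids):
        self.points = points
        self.ids = ids

    def query(self, q):
        # Return (id, distance) of the closest indexed point.
        d = np.linalg.norm(self.points - q, axis=1)
        j = int(np.argmin(d))
        return int(self.ids[j]), float(d[j])


class WindowTree:
    """Binary tree over label-sorted points; every node indexes its own label range."""

    def __init__(self, points, labels, leaf_size=2):
        order = np.argsort(labels, kind="stable")
        self.points = points[order]
        self.labels = labels[order]
        self.ids = order  # original positions of the sorted points
        self.leaf_size = leaf_size
        self.nodes = {}
        self._build(0, len(labels))

    def _build(self, lo, hi):
        # One sub-index per node, covering sorted positions [lo, hi).
        self.nodes[(lo, hi)] = BruteForceIndex(self.points[lo:hi], self.ids[lo:hi])
        if hi - lo > self.leaf_size:
            mid = (lo + hi) // 2
            self._build(lo, mid)
            self._build(mid, hi)

    def query(self, q, a, b):
        """Nearest point whose label lies in the closed window [a, b]."""
        lo = int(np.searchsorted(self.labels, a, side="left"))
        hi = int(np.searchsorted(self.labels, b, side="right"))
        if lo >= hi:
            return None, float("inf")  # no point falls inside the window
        return self._search(q, 0, len(self.labels), lo, hi)

    def _search(self, q, nlo, nhi, lo, hi):
        if hi <= nlo or nhi <= lo:
            return None, float("inf")  # node is disjoint from the window
        if lo <= nlo and nhi <= hi:
            return self.nodes[(nlo, nhi)].query(q)  # node lies fully inside
        if nhi - nlo <= self.leaf_size:
            # Partially covered leaf: scan only the points inside the window.
            s, e = max(lo, nlo), min(hi, nhi)
            d = np.linalg.norm(self.points[s:e] - q, axis=1)
            j = int(np.argmin(d))
            return int(self.ids[s + j]), float(d[j])
        mid = (nlo + nhi) // 2
        left = self._search(q, nlo, mid, lo, hi)
        right = self._search(q, mid, nhi, lo, hi)
        return min(left, right, key=lambda t: t[1])


# Usage: six 1-D points whose labels play the role of timestamps.
points = np.array([[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]])
labels = np.array([10.0, 20.0, 30.0, 40.0, 50.0, 60.0])
tree = WindowTree(points, labels)
nearest_id, nearest_dist = tree.query(np.array([0.2]), 30.0, 50.0)
```

In this toy setup a window query touches only the nodes that cover the window, rather than filtering the whole dataset. Replacing `BruteForceIndex` with an approximate index would make the result approximate as well.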

Cite this Paper


BibTeX
@InProceedings{pmlr-v235-engels24a,
  title     = {Approximate Nearest Neighbor Search with Window Filters},
  author    = {Engels, Joshua and Landrum, Ben and Yu, Shangdi and Dhulipala, Laxman and Shun, Julian},
  booktitle = {Proceedings of the 41st International Conference on Machine Learning},
  pages     = {12469--12490},
  year      = {2024},
  editor    = {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix},
  volume    = {235},
  series    = {Proceedings of Machine Learning Research},
  month     = {21--27 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v235/main/assets/engels24a/engels24a.pdf},
  url       = {https://proceedings.mlr.press/v235/engels24a.html},
  abstract  = {We define and investigate the problem of c-approximate window search: approximate nearest neighbor search where each point in the dataset has a numeric label, and the goal is to find nearest neighbors to queries within arbitrary label ranges. Many semantic search problems, such as image and document search with timestamp filters, or product search with cost filters, are natural examples of this problem. We propose and theoretically analyze a modular tree-based framework for transforming an index that solves the traditional c-approximate nearest neighbor problem into a data structure that solves window search. On standard nearest neighbor benchmark datasets equipped with random label values, adversarially constructed embeddings, and image search embeddings with real timestamps, we obtain up to a $75\times$ speedup over existing solutions at the same level of recall.}
}
Endnote
%0 Conference Paper
%T Approximate Nearest Neighbor Search with Window Filters
%A Joshua Engels
%A Ben Landrum
%A Shangdi Yu
%A Laxman Dhulipala
%A Julian Shun
%B Proceedings of the 41st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Ruslan Salakhutdinov
%E Zico Kolter
%E Katherine Heller
%E Adrian Weller
%E Nuria Oliver
%E Jonathan Scarlett
%E Felix Berkenkamp
%F pmlr-v235-engels24a
%I PMLR
%P 12469--12490
%U https://proceedings.mlr.press/v235/engels24a.html
%V 235
%X We define and investigate the problem of c-approximate window search: approximate nearest neighbor search where each point in the dataset has a numeric label, and the goal is to find nearest neighbors to queries within arbitrary label ranges. Many semantic search problems, such as image and document search with timestamp filters, or product search with cost filters, are natural examples of this problem. We propose and theoretically analyze a modular tree-based framework for transforming an index that solves the traditional c-approximate nearest neighbor problem into a data structure that solves window search. On standard nearest neighbor benchmark datasets equipped with random label values, adversarially constructed embeddings, and image search embeddings with real timestamps, we obtain up to a $75\times$ speedup over existing solutions at the same level of recall.
APA
Engels, J., Landrum, B., Yu, S., Dhulipala, L. & Shun, J. (2024). Approximate Nearest Neighbor Search with Window Filters. Proceedings of the 41st International Conference on Machine Learning, in Proceedings of Machine Learning Research 235:12469-12490. Available from https://proceedings.mlr.press/v235/engels24a.html.