Fast and Bayes-consistent nearest neighbors

Klim Efremenko, Aryeh Kontorovich, Moshe Noivirt
Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics, PMLR 108:1276-1286, 2020.

Abstract

Research on nearest-neighbor methods tends to focus somewhat dichotomously either on the statistical or the computational aspects – either on, say, Bayes consistency and rates of convergence or on techniques for speeding up the proximity search. This paper aims at bridging these realms: to reap the advantages of fast evaluation time while maintaining Bayes consistency, and further without sacrificing too much in the risk decay rate. We combine the locality-sensitive hashing (LSH) technique with a novel missing-mass argument to obtain a fast and Bayes-consistent classifier. Our algorithm’s prediction runtime compares favorably against state of the art approximate NN methods, while maintaining Bayes-consistency and attaining rates comparable to minimax. On samples of size $n$ in $\R^d$, our pre-processing phase has runtime $O(d n \log n)$, while the evaluation phase has runtime $O(d\log n)$ per query point.

Cite this Paper


BibTeX
@InProceedings{pmlr-v108-efremenko20a, title = {Fast and Bayes-consistent nearest neighbors}, author = {Efremenko, Klim and Kontorovich, Aryeh and Noivirt, Moshe}, booktitle = {Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics}, pages = {1276--1286}, year = {2020}, editor = {Silvia Chiappa and Roberto Calandra}, volume = {108}, series = {Proceedings of Machine Learning Research}, month = {26--28 Aug}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v108/efremenko20a/efremenko20a.pdf}, url = { http://proceedings.mlr.press/v108/efremenko20a.html }, abstract = {Research on nearest-neighbor methods tends to focus somewhat dichotomously either on the statistical or the computational aspects – either on, say, Bayes consistency and rates of convergence or on techniques for speeding up the proximity search. This paper aims at bridging these realms: to reap the advantages of fast evaluation time while maintaining Bayes consistency, and further without sacrificing too much in the risk decay rate. We combine the locality-sensitive hashing (LSH) technique with a novel missing-mass argument to obtain a fast and Bayes-consistent classifier. Our algorithm’s prediction runtime compares favorably against state of the art approximate NN methods, while maintaining Bayes-consistency and attaining rates comparable to minimax. On samples of size $n$ in $\R^d$, our pre-processing phase has runtime $O(d n \log n)$, while the evaluation phase has runtime $O(d\log n)$ per query point.} }
Endnote
%0 Conference Paper %T Fast and Bayes-consistent nearest neighbors %A Klim Efremenko %A Aryeh Kontorovich %A Moshe Noivirt %B Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics %C Proceedings of Machine Learning Research %D 2020 %E Silvia Chiappa %E Roberto Calandra %F pmlr-v108-efremenko20a %I PMLR %P 1276--1286 %U http://proceedings.mlr.press/v108/efremenko20a.html %V 108 %X Research on nearest-neighbor methods tends to focus somewhat dichotomously either on the statistical or the computational aspects – either on, say, Bayes consistency and rates of convergence or on techniques for speeding up the proximity search. This paper aims at bridging these realms: to reap the advantages of fast evaluation time while maintaining Bayes consistency, and further without sacrificing too much in the risk decay rate. We combine the locality-sensitive hashing (LSH) technique with a novel missing-mass argument to obtain a fast and Bayes-consistent classifier. Our algorithm’s prediction runtime compares favorably against state of the art approximate NN methods, while maintaining Bayes-consistency and attaining rates comparable to minimax. On samples of size $n$ in $\R^d$, our pre-processing phase has runtime $O(d n \log n)$, while the evaluation phase has runtime $O(d\log n)$ per query point.
APA
Efremenko, K., Kontorovich, A. & Noivirt, M.. (2020). Fast and Bayes-consistent nearest neighbors. Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 108:1276-1286 Available from http://proceedings.mlr.press/v108/efremenko20a.html .

Related Material