Faster cover trees

Mike Izbicki; Christian Shelton

Faster cover trees

Mike Izbicki, Christian Shelton

Proceedings of the 32nd International Conference on Machine Learning, PMLR 37:1162-1170, 2015.

Abstract

The cover tree data structure speeds up exact nearest neighbor queries over arbitrary metric spaces. This paper makes cover trees even faster. In particular, we provide (1) a simpler definition of the cover tree that reduces the number of nodes from O(n) to exactly n, (2) an additional invariant that makes queries faster in practice, (3) algorithms for constructing and querying the tree in parallel on multiprocessor systems, and (4) a more cache efficient memory layout. On standard benchmark datasets, we reduce the number of distance computations by 10–50%. On a large-scale bioinformatics dataset, we reduce the number of distance computations by 71%. On a large-scale image dataset, our parallel algorithm with 16 cores reduces tree construction time from 3.5 hours to 12 minutes.

Cite this Paper

BibTeX


@InProceedings{pmlr-v37-izbicki15,
  title = 	 {Faster cover trees},
  author = 	 {Izbicki, Mike and Shelton, Christian},
  booktitle = 	 {Proceedings of the 32nd International Conference on Machine Learning},
  pages = 	 {1162--1170},
  year = 	 {2015},
  editor = 	 {Bach, Francis and Blei, David},
  volume = 	 {37},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Lille, France},
  month = 	 {07--09 Jul},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v37/izbicki15.pdf},
  url = 	 {https://proceedings.mlr.press/v37/izbicki15.html},
  abstract = 	 {The cover tree data structure speeds up exact nearest neighbor queries over arbitrary metric spaces. This paper makes cover trees even faster. In particular, we provide (1) a simpler definition of the cover tree that reduces the number of nodes from O(n) to exactly n, (2) an additional invariant that makes queries faster in practice, (3) algorithms for constructing and querying the tree in parallel on multiprocessor systems, and (4) a more cache efficient memory layout. On standard benchmark datasets, we reduce the number of distance computations by 10–50%. On a large-scale bioinformatics dataset, we reduce the number of distance computations by 71%. On a large-scale image dataset, our parallel algorithm with 16 cores reduces tree construction time from 3.5 hours to 12 minutes.}
}

Endnote

%0 Conference Paper
%T Faster cover trees
%A Mike Izbicki
%A Christian Shelton
%B Proceedings of the 32nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2015
%E Francis Bach
%E David Blei	
%F pmlr-v37-izbicki15
%I PMLR
%P 1162--1170
%U https://proceedings.mlr.press/v37/izbicki15.html
%V 37
%X The cover tree data structure speeds up exact nearest neighbor queries over arbitrary metric spaces. This paper makes cover trees even faster. In particular, we provide (1) a simpler definition of the cover tree that reduces the number of nodes from O(n) to exactly n, (2) an additional invariant that makes queries faster in practice, (3) algorithms for constructing and querying the tree in parallel on multiprocessor systems, and (4) a more cache efficient memory layout. On standard benchmark datasets, we reduce the number of distance computations by 10–50%. On a large-scale bioinformatics dataset, we reduce the number of distance computations by 71%. On a large-scale image dataset, our parallel algorithm with 16 cores reduces tree construction time from 3.5 hours to 12 minutes.

RIS


TY  - CPAPER
TI  - Faster cover trees
AU  - Mike Izbicki
AU  - Christian Shelton
BT  - Proceedings of the 32nd International Conference on Machine Learning
DA  - 2015/06/01
ED  - Francis Bach
ED  - David Blei	
ID  - pmlr-v37-izbicki15
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 37
SP  - 1162
EP  - 1170
L1  - http://proceedings.mlr.press/v37/izbicki15.pdf
UR  - https://proceedings.mlr.press/v37/izbicki15.html
AB  - The cover tree data structure speeds up exact nearest neighbor queries over arbitrary metric spaces. This paper makes cover trees even faster. In particular, we provide (1) a simpler definition of the cover tree that reduces the number of nodes from O(n) to exactly n, (2) an additional invariant that makes queries faster in practice, (3) algorithms for constructing and querying the tree in parallel on multiprocessor systems, and (4) a more cache efficient memory layout. On standard benchmark datasets, we reduce the number of distance computations by 10–50%. On a large-scale bioinformatics dataset, we reduce the number of distance computations by 71%. On a large-scale image dataset, our parallel algorithm with 16 cores reduces tree construction time from 3.5 hours to 12 minutes.
ER  -

APA


Izbicki, M. & Shelton, C.. (2015). Faster cover trees. Proceedings of the 32nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 37:1162-1170 Available from https://proceedings.mlr.press/v37/izbicki15.html.

Related Material

Download PDF