Fast Classification with Binary Prototypes

Kai Zhong; Ruiqi Guo; Sanjiv Kumar; Bowei Yan; David Simcha; Inderjit Dhillon

Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, PMLR 54:1255-1263, 2017.

Abstract

In this work, we propose a new technique for fast k-nearest neighbor (k-NN) classification in which the original database is represented via a small set of learned binary prototypes. The training phase simultaneously learns a hash function which maps the data points to binary codes, and a set of representative binary prototypes. In the prediction phase, we first hash the query into a binary code and then do the k-NN classification using the binary prototypes as the database. Our approach speeds up k-NN classification in two aspects. First, we compress the database into a smaller set of prototypes such that k-NN search only goes through a smaller set rather than the whole dataset. Second, we reduce the original space to a compact binary embedding, where the Hamming distance between two binary codes is very efficient to compute. We propose a formulation to learn the hash function and prototypes such that the classification error is minimized. We also provide a novel theoretical analysis of the proposed technique in terms of Bayes error consistency. Empirically, our method is much faster than the state-of-the-art k-NN compression methods with comparable accuracy.
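
The prediction phase described in the abstract (hash the query, then run k-NN over binary prototypes under Hamming distance) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the hash is assumed to be a sign-of-linear-projection function with given weights, the prototypes are assumed to be already learned, and the names `hash_to_code`, `hamming`, and `classify` are hypothetical. The joint training of hash function and prototypes is not shown.

```python
from collections import Counter


def hash_to_code(x, weights):
    """Map a real-valued vector to a binary code packed into an int.

    Assumption: one bit per row of `weights`, set when the dot product
    with `x` is non-negative. The paper learns its hash function; this
    fixed linear form is only a stand-in.
    """
    code = 0
    for w in weights:
        bit = 1 if sum(wi * xi for wi, xi in zip(w, x)) >= 0 else 0
        code = (code << 1) | bit
    return code


def hamming(a, b):
    """Hamming distance between two codes: count the differing bits."""
    return bin(a ^ b).count("1")


def classify(x, weights, prototypes, k=1):
    """Classify `x` by majority vote over its k nearest binary prototypes.

    `prototypes` is a list of (code, label) pairs, assumed to be learned
    beforehand and much smaller than the original training set.
    """
    query = hash_to_code(x, weights)
    nearest = sorted(prototypes, key=lambda p: hamming(query, p[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy data, invented for illustration: 3-bit codes, four prototypes.
W = [[1, 0], [0, 1], [1, 1]]
P = [(0b111, "pos"), (0b110, "pos"), (0b000, "neg"), (0b001, "neg")]
label = classify([2, 3], W, P, k=3)
```

Both sources of speed-up named in the abstract show up here: the search loop runs over the short prototype list rather than the full dataset, and each distance is an XOR followed by a bit count.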

Cite this Paper

BibTeX

@InProceedings{pmlr-v54-zhong17a,
  title = 	 {{Fast Classification with Binary Prototypes}},
  author = 	 {Zhong, Kai and Guo, Ruiqi and Kumar, Sanjiv and Yan, Bowei and Simcha, David and Dhillon, Inderjit},
  booktitle = 	 {Proceedings of the 20th International Conference on Artificial Intelligence and Statistics},
  pages = 	 {1255--1263},
  year = 	 {2017},
  editor = 	 {Singh, Aarti and Zhu, Jerry},
  volume = 	 {54},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {20--22 Apr},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v54/zhong17a/zhong17a.pdf},
  url = 	 {https://proceedings.mlr.press/v54/zhong17a.html},
  abstract = 	 {In this work, we propose a new technique for \emph{fast} k-nearest neighbor (k-NN) classification in which the original database is represented via a small set of learned binary prototypes. The training phase simultaneously learns a hash function which maps the data points to binary codes, and a set of representative binary prototypes. In the prediction phase, we first hash the query into a binary code and then do the k-NN classification using the binary prototypes as the database. Our approach speeds up k-NN classification in two aspects. First, we compress the database into a smaller set of prototypes such that k-NN search only goes through a smaller set rather than the whole dataset. Second, we reduce the original space to a compact binary embedding, where the Hamming distance between two binary codes is very efficient to compute. We propose a formulation to learn the hash function and prototypes such that the classification error is minimized. We also provide a novel theoretical analysis of the proposed technique in terms of Bayes error consistency. Empirically, our method is much faster than the state-of-the-art k-NN compression methods with comparable accuracy.}
}

Endnote

%0 Conference Paper
%T Fast Classification with Binary Prototypes
%A Kai Zhong
%A Ruiqi Guo
%A Sanjiv Kumar
%A Bowei Yan
%A David Simcha
%A Inderjit Dhillon
%B Proceedings of the 20th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2017
%E Aarti Singh
%E Jerry Zhu
%F pmlr-v54-zhong17a
%I PMLR
%P 1255--1263
%U https://proceedings.mlr.press/v54/zhong17a.html
%V 54
%X In this work, we propose a new technique for fast k-nearest neighbor (k-NN) classification in which the original database is represented via a small set of learned binary prototypes. The training phase simultaneously learns a hash function which maps the data points to binary codes, and a set of representative binary prototypes. In the prediction phase, we first hash the query into a binary code and then do the k-NN classification using the binary prototypes as the database. Our approach speeds up k-NN classification in two aspects. First, we compress the database into a smaller set of prototypes such that k-NN search only goes through a smaller set rather than the whole dataset. Second, we reduce the original space to a compact binary embedding, where the Hamming distance between two binary codes is very efficient to compute. We propose a formulation to learn the hash function and prototypes such that the classification error is minimized. We also provide a novel theoretical analysis of the proposed technique in terms of Bayes error consistency. Empirically, our method is much faster than the state-of-the-art k-NN compression methods with comparable accuracy.

APA

Zhong, K., Guo, R., Kumar, S., Yan, B., Simcha, D., & Dhillon, I. (2017). Fast Classification with Binary Prototypes. Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 54:1255-1263. Available from https://proceedings.mlr.press/v54/zhong17a.html.
