Fast Classification with Binary Prototypes

Kai Zhong, Ruiqi Guo, Sanjiv Kumar, Bowei Yan, David Simcha, Inderjit Dhillon
Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, PMLR 54:1255-1263, 2017.

Abstract

In this work, we propose a new technique for fast k-nearest neighbor (k-NN) classification in which the original database is represented via a small set of learned binary prototypes. The training phase simultaneously learns a hash function which maps the data points to binary codes, and a set of representative binary prototypes. In the prediction phase, we first hash the query into a binary code and then do the k-NN classification using the binary prototypes as the database. Our approach speeds up k-NN classification in two aspects. First, we compress the database into a smaller set of prototypes such that k-NN search only goes through a smaller set rather than the whole dataset. Second, we reduce the original space to a compact binary embedding, where the Hamming distance between two binary codes is very efficient to compute. We propose a formulation to learn the hash function and prototypes such that the classification error is minimized. We also provide a novel theoretical analysis of the proposed technique in terms of Bayes error consistency. Empirically, our method is much faster than the state-of-the-art k-NN compression methods with comparable accuracy.
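
As a rough illustration of the prediction phase described in the abstract, the Python sketch below hashes a query to a binary code and classifies it by Hamming-distance k-NN over a small prototype database. The linear sign hash, the random prototypes, and all function names here are illustrative assumptions; the paper learns the hash function and the prototypes jointly to minimize classification error, which this sketch does not attempt.

import numpy as np

def hash_to_binary(X, W):
    """Map real-valued rows of X to binary codes via sign(X @ W).
    A generic linear hash used for illustration only; the paper's
    hash function is learned, not a fixed projection."""
    return (X @ W > 0).astype(np.uint8)

def knn_predict(query_code, prototype_codes, prototype_labels, k=5):
    """Classify one binary query against the binary prototype database
    using Hamming distance (number of differing bits)."""
    # XOR marks the differing bits; counting them gives the Hamming distance.
    dists = np.count_nonzero(prototype_codes ^ query_code, axis=1)
    nearest = np.argsort(dists)[:k]          # k closest prototypes
    votes = prototype_labels[nearest]
    return np.bincount(votes).argmax()       # majority vote among the k

# Toy usage: 64-bit codes, 100 prototypes, 2 classes (all synthetic).
rng = np.random.default_rng(0)
W = rng.standard_normal((32, 64))            # hash projection (random here)
prototypes = rng.integers(0, 2, (100, 64), dtype=np.uint8)
labels = rng.integers(0, 2, 100)
query = hash_to_binary(rng.standard_normal((1, 32)), W)[0]
print(knn_predict(query, prototypes, labels, k=5))

The speedup claimed in the abstract comes from exactly these two steps: the search loop runs over 100 prototypes instead of the full dataset, and each distance is a bitwise XOR plus popcount rather than a floating-point computation.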

Cite this Paper


BibTeX
@InProceedings{pmlr-v54-zhong17a,
  title     = {{Fast Classification with Binary Prototypes}},
  author    = {Zhong, Kai and Guo, Ruiqi and Kumar, Sanjiv and Yan, Bowei and Simcha, David and Dhillon, Inderjit},
  booktitle = {Proceedings of the 20th International Conference on Artificial Intelligence and Statistics},
  pages     = {1255--1263},
  year      = {2017},
  editor    = {Singh, Aarti and Zhu, Jerry},
  volume    = {54},
  series    = {Proceedings of Machine Learning Research},
  month     = {20--22 Apr},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v54/zhong17a/zhong17a.pdf},
  url       = {https://proceedings.mlr.press/v54/zhong17a.html},
  abstract  = {In this work, we propose a new technique for \emph{fast} k-nearest neighbor (k-NN) classification in which the original database is represented via a small set of learned binary prototypes. The training phase simultaneously learns a hash function which maps the data points to binary codes, and a set of representative binary prototypes. In the prediction phase, we first hash the query into a binary code and then do the k-NN classification using the binary prototypes as the database. Our approach speeds up k-NN classification in two aspects. First, we compress the database into a smaller set of prototypes such that k-NN search only goes through a smaller set rather than the whole dataset. Second, we reduce the original space to a compact binary embedding, where the Hamming distance between two binary codes is very efficient to compute. We propose a formulation to learn the hash function and prototypes such that the classification error is minimized. We also provide a novel theoretical analysis of the proposed technique in terms of Bayes error consistency. Empirically, our method is much faster than the state-of-the-art k-NN compression methods with comparable accuracy.}
}
Endnote
%0 Conference Paper
%T Fast Classification with Binary Prototypes
%A Kai Zhong
%A Ruiqi Guo
%A Sanjiv Kumar
%A Bowei Yan
%A David Simcha
%A Inderjit Dhillon
%B Proceedings of the 20th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2017
%E Aarti Singh
%E Jerry Zhu
%F pmlr-v54-zhong17a
%I PMLR
%P 1255--1263
%U https://proceedings.mlr.press/v54/zhong17a.html
%V 54
%X In this work, we propose a new technique for fast k-nearest neighbor (k-NN) classification in which the original database is represented via a small set of learned binary prototypes. The training phase simultaneously learns a hash function which maps the data points to binary codes, and a set of representative binary prototypes. In the prediction phase, we first hash the query into a binary code and then do the k-NN classification using the binary prototypes as the database. Our approach speeds up k-NN classification in two aspects. First, we compress the database into a smaller set of prototypes such that k-NN search only goes through a smaller set rather than the whole dataset. Second, we reduce the original space to a compact binary embedding, where the Hamming distance between two binary codes is very efficient to compute. We propose a formulation to learn the hash function and prototypes such that the classification error is minimized. We also provide a novel theoretical analysis of the proposed technique in terms of Bayes error consistency. Empirically, our method is much faster than the state-of-the-art k-NN compression methods with comparable accuracy.
APA
Zhong, K., Guo, R., Kumar, S., Yan, B., Simcha, D. & Dhillon, I. (2017). Fast Classification with Binary Prototypes. Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 54:1255-1263. Available from https://proceedings.mlr.press/v54/zhong17a.html.