Candidates vs. Noises Estimation for Large Multi-Class Classification Problem

Lei Han, Yiheng Huang, Tong Zhang
Proceedings of the 35th International Conference on Machine Learning, PMLR 80:1890-1899, 2018.

Abstract

This paper proposes a method for multi-class classification problems, where the number of classes K is large. The method, referred to as Candidates vs. Noises Estimation (CANE), selects a small subset of candidate classes and samples the remaining classes. We show that CANE is always consistent and computationally efficient. Moreover, the resulting estimator has low statistical variance approaching that of the maximum likelihood estimator, when the observed label belongs to the selected candidates with high probability. In practice, we use a tree structure with leaves as classes to promote fast beam search for candidate selection. We further apply the CANE method to estimate word probabilities in learning large neural language models. Extensive experimental results show that CANE achieves better prediction accuracy over the Noise-Contrastive Estimation (NCE), its variants and a number of the state-of-the-art tree classifiers, while it gains significant speedup compared to standard O(K) methods.

Cite this Paper


BibTeX
@InProceedings{pmlr-v80-han18a,
  title = {Candidates vs. Noises Estimation for Large Multi-Class Classification Problem},
  author = {Han, Lei and Huang, Yiheng and Zhang, Tong},
  booktitle = {Proceedings of the 35th International Conference on Machine Learning},
  pages = {1890--1899},
  year = {2018},
  editor = {Dy, Jennifer and Krause, Andreas},
  volume = {80},
  series = {Proceedings of Machine Learning Research},
  month = {10--15 Jul},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v80/han18a/han18a.pdf},
  url = {https://proceedings.mlr.press/v80/han18a.html},
  abstract = {This paper proposes a method for multi-class classification problems, where the number of classes K is large. The method, referred to as Candidates vs. Noises Estimation (CANE), selects a small subset of candidate classes and samples the remaining classes. We show that CANE is always consistent and computationally efficient. Moreover, the resulting estimator has low statistical variance approaching that of the maximum likelihood estimator, when the observed label belongs to the selected candidates with high probability. In practice, we use a tree structure with leaves as classes to promote fast beam search for candidate selection. We further apply the CANE method to estimate word probabilities in learning large neural language models. Extensive experimental results show that CANE achieves better prediction accuracy over the Noise-Contrastive Estimation (NCE), its variants and a number of the state-of-the-art tree classifiers, while it gains significant speedup compared to standard O(K) methods.}
}
Endnote
%0 Conference Paper
%T Candidates vs. Noises Estimation for Large Multi-Class Classification Problem
%A Lei Han
%A Yiheng Huang
%A Tong Zhang
%B Proceedings of the 35th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2018
%E Jennifer Dy
%E Andreas Krause
%F pmlr-v80-han18a
%I PMLR
%P 1890--1899
%U https://proceedings.mlr.press/v80/han18a.html
%V 80
%X This paper proposes a method for multi-class classification problems, where the number of classes K is large. The method, referred to as Candidates vs. Noises Estimation (CANE), selects a small subset of candidate classes and samples the remaining classes. We show that CANE is always consistent and computationally efficient. Moreover, the resulting estimator has low statistical variance approaching that of the maximum likelihood estimator, when the observed label belongs to the selected candidates with high probability. In practice, we use a tree structure with leaves as classes to promote fast beam search for candidate selection. We further apply the CANE method to estimate word probabilities in learning large neural language models. Extensive experimental results show that CANE achieves better prediction accuracy over the Noise-Contrastive Estimation (NCE), its variants and a number of the state-of-the-art tree classifiers, while it gains significant speedup compared to standard O(K) methods.
APA
Han, L., Huang, Y. & Zhang, T. (2018). Candidates vs. Noises Estimation for Large Multi-Class Classification Problem. Proceedings of the 35th International Conference on Machine Learning, in Proceedings of Machine Learning Research 80:1890-1899. Available from https://proceedings.mlr.press/v80/han18a.html.
