Multiclass-Multilabel Classification with More Classes than Examples

Ofer Dekel; Ohad Shamir

Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, PMLR 9:137-144, 2010.

Abstract

We discuss multiclass-multilabel classification problems in which the set of possible labels is extremely large. Most existing multiclass-multilabel learning algorithms expect to observe a reasonably large sample from each class, and fail if they receive only a handful of examples with a given label. We propose and analyze the following two-stage approach: first use an arbitrary (perhaps heuristic) classification algorithm to construct an initial classifier, then apply a simple but principled method to augment this classifier by removing harmful labels from its output. A careful theoretical analysis allows us to justify our approach under some reasonable conditions (such as label sparsity and power-law distribution of label frequencies), even when the training set does not provide a statistically accurate representation of most classes. Surprisingly, our theoretical analysis continues to hold even when the number of classes exceeds the sample size. We demonstrate the merits of our approach on the ambitious task of categorizing the entire web using the 1.5 million categories defined on Wikipedia.
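The two-stage idea in the abstract can be illustrated with a minimal sketch. The abstract does not state the pruning rule, so the one used here is an assumption made purely for illustration: a label is treated as harmful when, on held-out data, the initial classifier outputs it incorrectly more often than correctly. The names `find_harmful_labels`, `prune`, and `toy_classifier` are hypothetical and do not come from the paper.

```python
from collections import Counter
from typing import Callable, Hashable, Iterable, Set, Tuple

Label = Hashable
Example = Tuple[object, Set[Label]]  # (input, set of true labels)
Classifier = Callable[[object], Set[Label]]


def find_harmful_labels(initial_classifier: Classifier,
                        held_out: Iterable[Example]) -> Set[Label]:
    """Return the labels the initial classifier gets wrong more often than right.

    Assumed rule (illustrative, not taken from the paper): a label is
    "harmful" if its false positives on held-out data outnumber its
    true positives.
    """
    true_pos: Counter = Counter()
    false_pos: Counter = Counter()
    for x, true_labels in held_out:
        for label in initial_classifier(x):
            if label in true_labels:
                true_pos[label] += 1
            else:
                false_pos[label] += 1
    return {label for label in false_pos if false_pos[label] > true_pos[label]}


def prune(initial_classifier: Classifier, harmful: Set[Label]) -> Classifier:
    """Stage two: wrap the initial classifier and drop harmful labels from its output."""
    def pruned(x: object) -> Set[Label]:
        return initial_classifier(x) - harmful
    return pruned


# Stage one stand-in: an arbitrary (heuristic) classifier. It always
# predicts "science" and adds "sports" when the text contains "ball".
def toy_classifier(x: str) -> Set[Label]:
    labels = {"science"}
    if "ball" in x:
        labels.add("sports")
    return labels


held_out = [
    ("ball game", {"sports"}),
    ("quantum physics", {"science"}),
    ("football", {"sports"}),
]

harmful = find_harmful_labels(toy_classifier, held_out)
pruned = prune(toy_classifier, harmful)
```

In this toy data, "science" is predicted for every input but is correct only once, so the assumed rule removes it, while "sports" is kept. Pruning only ever removes labels from the output, so it can fix false positives of the initial classifier but not its false negatives.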

Cite this Paper

BibTeX


@InProceedings{pmlr-v9-dekel10a,
  title = 	 {Multiclass-Multilabel Classification with More Classes than Examples},
  author = 	 {Dekel, Ofer and Shamir, Ohad},
  booktitle = 	 {Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics},
  pages = 	 {137--144},
  year = 	 {2010},
  editor = 	 {Teh, Yee Whye and Titterington, Mike},
  volume = 	 {9},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Chia Laguna Resort, Sardinia, Italy},
  month = 	 {13--15 May},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v9/dekel10a/dekel10a.pdf},
  url = 	 {https://proceedings.mlr.press/v9/dekel10a.html},
  abstract = 	 {We discuss multiclass-multilabel classification problems in which the set of possible labels is extremely large. Most existing multiclass-multilabel learning algorithms expect to observe a reasonably large sample from each class, and fail if they receive only a handful of examples with a given label. We propose and analyze the following two-stage approach: first use an arbitrary (perhaps heuristic) classification algorithm to construct an initial classifier, then apply a simple but principled method to augment this classifier by removing harmful labels from its output. A careful theoretical analysis allows us to justify our approach under some reasonable conditions (such as label sparsity and power-law distribution of label frequencies), even when the training set does not provide a statistically accurate representation of most classes. Surprisingly, our theoretical analysis continues to hold even when the number of classes exceeds the sample size. We demonstrate the merits of our approach on the ambitious task of categorizing the entire web using the 1.5 million categories defined on Wikipedia.}
}

Endnote

%0 Conference Paper
%T Multiclass-Multilabel Classification with More Classes than Examples
%A Ofer Dekel
%A Ohad Shamir
%B Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2010
%E Yee Whye Teh
%E Mike Titterington
%F pmlr-v9-dekel10a
%I PMLR
%P 137--144
%U https://proceedings.mlr.press/v9/dekel10a.html
%V 9
%X We discuss multiclass-multilabel classification problems in which the set of possible labels is extremely large. Most existing multiclass-multilabel learning algorithms expect to observe a reasonably large sample from each class, and fail if they receive only a handful of examples with a given label. We propose and analyze the following two-stage approach: first use an arbitrary (perhaps heuristic) classification algorithm to construct an initial classifier, then apply a simple but principled method to augment this classifier by removing harmful labels from its output. A careful theoretical analysis allows us to justify our approach under some reasonable conditions (such as label sparsity and power-law distribution of label frequencies), even when the training set does not provide a statistically accurate representation of most classes. Surprisingly, our theoretical analysis continues to hold even when the number of classes exceeds the sample size. We demonstrate the merits of our approach on the ambitious task of categorizing the entire web using the 1.5 million categories defined on Wikipedia.

RIS


TY  - CPAPER
TI  - Multiclass-Multilabel Classification with More Classes than Examples
AU  - Ofer Dekel
AU  - Ohad Shamir
BT  - Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics
DA  - 2010/03/31
ED  - Yee Whye Teh
ED  - Mike Titterington
ID  - pmlr-v9-dekel10a
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 9
SP  - 137
EP  - 144
L1  - http://proceedings.mlr.press/v9/dekel10a/dekel10a.pdf
UR  - https://proceedings.mlr.press/v9/dekel10a.html
AB  - We discuss multiclass-multilabel classification problems in which the set of possible labels is extremely large. Most existing multiclass-multilabel learning algorithms expect to observe a reasonably large sample from each class, and fail if they receive only a handful of examples with a given label. We propose and analyze the following two-stage approach: first use an arbitrary (perhaps heuristic) classification algorithm to construct an initial classifier, then apply a simple but principled method to augment this classifier by removing harmful labels from its output. A careful theoretical analysis allows us to justify our approach under some reasonable conditions (such as label sparsity and power-law distribution of label frequencies), even when the training set does not provide a statistically accurate representation of most classes. Surprisingly, our theoretical analysis continues to hold even when the number of classes exceeds the sample size. We demonstrate the merits of our approach on the ambitious task of categorizing the entire web using the 1.5 million categories defined on Wikipedia.
ER  -

APA


Dekel, O. & Shamir, O. (2010). Multiclass-Multilabel Classification with More Classes than Examples. Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 9:137-144. Available from https://proceedings.mlr.press/v9/dekel10a.html.
