Multiclass-Multilabel Classification with More Classes than Examples

Ofer Dekel, Ohad Shamir
Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, JMLR Workshop and Conference Proceedings 9:137-144, 2010.

Abstract

We discuss multiclass-multilabel classification problems in which the set of possible labels is extremely large. Most existing multiclass-multilabel learning algorithms expect to observe a reasonably large sample from each class, and fail if they receive only a handful of examples with a given label. We propose and analyze the following two-stage approach: first use an arbitrary (perhaps heuristic) classification algorithm to construct an initial classifier, then apply a simple but principled method to augment this classifier by removing harmful labels from its output. A careful theoretical analysis allows us to justify our approach under some reasonable conditions (such as label sparsity and power-law distribution of label frequencies), even when the training set does not provide a statistically accurate representation of most classes. Surprisingly, our theoretical analysis continues to hold even when the number of classes exceeds the sample size. We demonstrate the merits of our approach on the ambitious task of categorizing the entire web using the 1.5 million categories defined on Wikipedia.
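The two-stage idea in the abstract can be sketched in a few lines: wrap an arbitrary base classifier, measure on held-out data which of its output labels do more harm than good, and suppress those labels at prediction time. The sketch below uses a simple per-label precision threshold as the pruning criterion; this is an illustrative stand-in only, not the principled rule derived in the paper, and all names (`prune_harmful_labels`, `base`, the threshold of 0.5) are hypothetical.

```python
from collections import Counter

def prune_harmful_labels(predict, val_data, threshold=0.5):
    """Stage 2 of the two-stage approach: wrap a base classifier and
    drop labels whose validation precision falls below `threshold`.
    (Illustrative criterion only; the paper derives its own rule.)"""
    predicted = Counter()   # how often each label was predicted
    correct = Counter()     # how often that prediction was right
    for x, true_labels in val_data:
        for label in predict(x):
            predicted[label] += 1
            if label in true_labels:
                correct[label] += 1
    harmful = {l for l in predicted if correct[l] / predicted[l] < threshold}

    def pruned_predict(x):
        # Same base classifier, with harmful labels removed from its output.
        return [l for l in predict(x) if l not in harmful]
    return pruned_predict

# Toy base classifier (stage 1) over a tiny label set.
def base(x):
    return ["sports", "politics"] if x == "doc1" else ["sports"]

val = [("doc1", {"sports"}), ("doc2", {"sports"})]
clf = prune_harmful_labels(base, val)
print(clf("doc1"))  # "politics" is never correct on validation, so it is pruned
```

Note that the base classifier is treated as a black box, matching the abstract's point that it may be arbitrary or heuristic; only its outputs are audited.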

Cite this Paper


BibTeX
@InProceedings{pmlr-v9-dekel10a,
  title = {Multiclass-Multilabel Classification with More Classes than Examples},
  author = {Ofer Dekel and Ohad Shamir},
  booktitle = {Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics},
  pages = {137--144},
  year = {2010},
  editor = {Yee Whye Teh and Mike Titterington},
  volume = {9},
  series = {Proceedings of Machine Learning Research},
  address = {Chia Laguna Resort, Sardinia, Italy},
  month = {13--15 May},
  publisher = {JMLR Workshop and Conference Proceedings},
  pdf = {http://proceedings.mlr.press/v9/dekel10a/dekel10a.pdf},
  url = {http://proceedings.mlr.press/v9/dekel10a.html},
  abstract = {We discuss multiclass-multilabel classification problems in which the set of possible labels is extremely large. Most existing multiclass-multilabel learning algorithms expect to observe a reasonably large sample from each class, and fail if they receive only a handful of examples with a given label. We propose and analyze the following two-stage approach: first use an arbitrary (perhaps heuristic) classification algorithm to construct an initial classifier, then apply a simple but principled method to augment this classifier by removing harmful labels from its output. A careful theoretical analysis allows us to justify our approach under some reasonable conditions (such as label sparsity and power-law distribution of label frequencies), even when the training set does not provide a statistically accurate representation of most classes. Surprisingly, our theoretical analysis continues to hold even when the number of classes exceeds the sample size. We demonstrate the merits of our approach on the ambitious task of categorizing the entire web using the 1.5 million categories defined on Wikipedia.}
}
Endnote
%0 Conference Paper
%T Multiclass-Multilabel Classification with More Classes than Examples
%A Ofer Dekel
%A Ohad Shamir
%B Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2010
%E Yee Whye Teh
%E Mike Titterington
%F pmlr-v9-dekel10a
%I PMLR
%J Proceedings of Machine Learning Research
%P 137--144
%U http://proceedings.mlr.press
%V 9
%W PMLR
%X We discuss multiclass-multilabel classification problems in which the set of possible labels is extremely large. Most existing multiclass-multilabel learning algorithms expect to observe a reasonably large sample from each class, and fail if they receive only a handful of examples with a given label. We propose and analyze the following two-stage approach: first use an arbitrary (perhaps heuristic) classification algorithm to construct an initial classifier, then apply a simple but principled method to augment this classifier by removing harmful labels from its output. A careful theoretical analysis allows us to justify our approach under some reasonable conditions (such as label sparsity and power-law distribution of label frequencies), even when the training set does not provide a statistically accurate representation of most classes. Surprisingly, our theoretical analysis continues to hold even when the number of classes exceeds the sample size. We demonstrate the merits of our approach on the ambitious task of categorizing the entire web using the 1.5 million categories defined on Wikipedia.
RIS
TY  - CPAPER
TI  - Multiclass-Multilabel Classification with More Classes than Examples
AU  - Ofer Dekel
AU  - Ohad Shamir
BT  - Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics
PY  - 2010/03/31
DA  - 2010/03/31
ED  - Yee Whye Teh
ED  - Mike Titterington
ID  - pmlr-v9-dekel10a
PB  - PMLR
SP  - 137
DP  - PMLR
EP  - 144
L1  - http://proceedings.mlr.press/v9/dekel10a/dekel10a.pdf
UR  - http://proceedings.mlr.press/v9/dekel10a.html
AB  - We discuss multiclass-multilabel classification problems in which the set of possible labels is extremely large. Most existing multiclass-multilabel learning algorithms expect to observe a reasonably large sample from each class, and fail if they receive only a handful of examples with a given label. We propose and analyze the following two-stage approach: first use an arbitrary (perhaps heuristic) classification algorithm to construct an initial classifier, then apply a simple but principled method to augment this classifier by removing harmful labels from its output. A careful theoretical analysis allows us to justify our approach under some reasonable conditions (such as label sparsity and power-law distribution of label frequencies), even when the training set does not provide a statistically accurate representation of most classes. Surprisingly, our theoretical analysis continues to hold even when the number of classes exceeds the sample size. We demonstrate the merits of our approach on the ambitious task of categorizing the entire web using the 1.5 million categories defined on Wikipedia.
ER  -
APA
Dekel, O. & Shamir, O. (2010). Multiclass-Multilabel Classification with More Classes than Examples. Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, in PMLR 9:137-144