Efficient Multi-label Classification with Many Labels

Wei Bi; James Kwok

Proceedings of the 30th International Conference on Machine Learning, PMLR 28(3):405-413, 2013.

Abstract

Multi-label classification deals with the problem where each instance can be associated with a set of class labels. However, in many real-world applications, the number of class labels can be in the hundreds or even thousands, and existing multi-label classification methods often become computationally inefficient. In recent years, a number of remedies have been proposed. However, they are either based on simple dimension reduction techniques or involve expensive optimization problems. In this paper, we address this problem by selecting a small subset of class labels that can approximately span the original label space. This is performed by randomized sampling where the sampling probability of each class label reflects its importance among all the labels. Theoretical analysis shows that this randomized sampling approach is highly efficient. Experiments on a number of real-world multi-label datasets with many labels demonstrate the appealing performance and efficiency of the proposed algorithm.

Cite this Paper

BibTeX


@InProceedings{pmlr-v28-bi13,
  title     = {Efficient Multi-label Classification with Many Labels},
  author    = {Bi, Wei and Kwok, James},
  booktitle = {Proceedings of the 30th International Conference on Machine Learning},
  pages     = {405--413},
  year      = {2013},
  editor    = {Dasgupta, Sanjoy and McAllester, David},
  volume    = {28},
  number    = {3},
  series    = {Proceedings of Machine Learning Research},
  address   = {Atlanta, Georgia, USA},
  month     = {17--19 Jun},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v28/bi13.pdf},
  url       = {https://proceedings.mlr.press/v28/bi13.html},
  abstract = 	 {Multi-label classification deals with the problem where each instance can be associated with a set of class labels. However, in many real-world applications, the number of class labels can be in the hundreds or even thousands, and existing multi-label classification methods often become computationally inefficient. In recent years, a number of remedies have been proposed. However, they are either based on simple dimension reduction techniques or involve expensive optimization problems. In this paper, we address this problem by selecting a small subset of class labels that can approximately span the original label space. This is performed by randomized sampling where the sampling probability of each class label reflects its importance among all the labels. Theoretical analysis shows that this randomized sampling approach is highly efficient. Experiments on a number of real-world multi-label datasets with many labels demonstrate the appealing performance and efficiency of the proposed algorithm.  }
}

Endnote

%0 Conference Paper
%T Efficient Multi-label Classification with Many Labels
%A Wei Bi
%A James Kwok
%B Proceedings of the 30th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2013
%E Sanjoy Dasgupta
%E David McAllester
%F pmlr-v28-bi13
%I PMLR
%P 405--413
%U https://proceedings.mlr.press/v28/bi13.html
%V 28
%N 3
%X Multi-label classification deals with the problem where each instance can be associated with a set of class labels. However, in many real-world applications, the number of class labels can be in the hundreds or even thousands, and existing multi-label classification methods often become computationally inefficient. In recent years, a number of remedies have been proposed. However, they are either based on simple dimension reduction techniques or involve expensive optimization problems. In this paper, we address this problem by selecting a small subset of class labels that can approximately span the original label space. This is performed by randomized sampling where the sampling probability of each class label reflects its importance among all the labels. Theoretical analysis shows that this randomized sampling approach is highly efficient. Experiments on a number of real-world multi-label datasets with many labels demonstrate the appealing performance and efficiency of the proposed algorithm.

RIS


TY  - CPAPER
TI  - Efficient Multi-label Classification with Many Labels
AU  - Wei Bi
AU  - James Kwok
BT  - Proceedings of the 30th International Conference on Machine Learning
DA  - 2013/05/26
ED  - Sanjoy Dasgupta
ED  - David McAllester
ID  - pmlr-v28-bi13
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 28
IS  - 3
SP  - 405
EP  - 413
L1  - http://proceedings.mlr.press/v28/bi13.pdf
UR  - https://proceedings.mlr.press/v28/bi13.html
AB  - Multi-label classification deals with the problem where each instance can be associated with a set of class labels. However, in many real-world applications, the number of class labels can be in the hundreds or even thousands, and existing multi-label classification methods often become computationally inefficient. In recent years, a number of remedies have been proposed. However, they are either based on simple dimension reduction techniques or involve expensive optimization problems. In this paper, we address this problem by selecting a small subset of class labels that can approximately span the original label space. This is performed by randomized sampling where the sampling probability of each class label reflects its importance among all the labels. Theoretical analysis shows that this randomized sampling approach is highly efficient. Experiments on a number of real-world multi-label datasets with many labels demonstrate the appealing performance and efficiency of the proposed algorithm.  
ER  -

APA


Bi, W. & Kwok, J. (2013). Efficient Multi-label Classification with Many Labels. Proceedings of the 30th International Conference on Machine Learning, in Proceedings of Machine Learning Research 28(3):405-413. Available from https://proceedings.mlr.press/v28/bi13.html.
