Multi-Label Classification with Unlabeled Data: An Inductive Approach
Proceedings of the 5th Asian Conference on Machine Learning, PMLR 29:197-212, 2013.
The problem of multi-label classification has attracted great interest in the last decade. Multi-label classification refers to problems where an example represented by a single instance can be assigned to more than one category. Until now, most research on multi-label classification has focused on supervised settings, which assume that a large amount of labeled training data is available. Unfortunately, labeling training examples is expensive and time-consuming, especially when an example has more than one label. However, in many cases abundant unlabeled data is easy to obtain. Current attempts at exploiting unlabeled data for multi-label classification work under the transductive setting, which aims at making predictions on the existing unlabeled data but cannot generalize to new unseen data. In this paper, the problem of inductive semi-supervised multi-label classification is studied, and a new approach named iMLCU, i.e. inductive Multi-Label Classification with Unlabeled data, is proposed. We formulate inductive semi-supervised multi-label learning as an optimization problem of learning linear models, and the ConCave Convex Procedure (CCCP) is applied to solve the resulting non-convex problem. Empirical studies on twelve diversified real-world multi-label learning tasks clearly validate the superiority of iMLCU over other well-established multi-label learning approaches.
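For readers unfamiliar with CCCP, the following is a minimal sketch of the general procedure on a one-dimensional toy problem. It is not the iMLCU objective, which the abstract does not spell out. The toy function and the names f, cccp_step and cccp are assumptions chosen purely for illustration. The idea is to write a non-convex objective as a convex part minus a convex part, replace the subtracted part with its linearization at the current iterate, and minimize the resulting convex surrogate.

```python
import math

def f(x):
    # Hypothetical non-convex toy objective, written as a difference of
    # convex functions: f(x) = u(x) - v(x), u(x) = x**4, v(x) = 2 * x**2.
    return x**4 - 2 * x**2

def cccp_step(x_t):
    # Linearize the concave part -v at x_t using v'(x_t) = 4 * x_t.
    # The convex surrogate x**4 - 4 * x_t * x is minimized where
    # 4 * x**3 = 4 * x_t, i.e. at the real cube root of x_t.
    return math.copysign(abs(x_t) ** (1.0 / 3.0), x_t)

def cccp(x0, tol=1e-10, max_iter=200):
    # Repeat the surrogate minimization until the iterate stops moving.
    xs = [x0]
    for _ in range(max_iter):
        xs.append(cccp_step(xs[-1]))
        if abs(xs[-1] - xs[-2]) < tol:
            break
    return xs

trajectory = cccp(0.5)
```

Started from a positive point, the iterates in this toy problem approach the local minimizer x = 1, and the objective value does not increase from one iterate to the next, which is the property that makes CCCP attractive for non-convex problems of this form.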