ML-NCA: Multi-label Neighbourhood Component Analysis
Proceedings of the Third International Workshop on Learning with Imbalanced Domains: Theory and Applications, PMLR 154:35-48, 2021.
Abstract
In multi-label classification, a datapoint can be assigned to more than one class simultaneously. Input space transformation methods transform the input space so that classification algorithms can perform better. Although existing transformation algorithms designed for binary or multi-class classification can be applied to multi-label datasets, doing so requires one transformation per label and is therefore very costly. Moreover, treating each label independently ignores label associations during the transformation process, which is a missed opportunity. In this work, a new input space transformation algorithm, Multi-label Neighbourhood Component Analysis (ML-NCA), is proposed. ML-NCA learns a single, supervised linear transformation of the input space, mapping it to a space in which $k$ nearest-neighbour based algorithms are expected to perform well. ML-NCA considers all the labels together while finding this single transformation, thereby removing the need for per-label transformations. This also implicitly takes advantage of label associations. An extensive set of experiments and detailed analysis demonstrate that the transformation found by ML-NCA significantly improves the performance of multi-label-specific $k$ nearest neighbour algorithms.
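To make the idea of a single label-aware linear transformation concrete, the following is a minimal illustrative sketch in Python with NumPy. It is not the ML-NCA objective: the abstract does not state that objective, so the sketch assumes an NCA-style soft-neighbour probability and, as a stand-in for label agreement, the Jaccard similarity between binary label vectors. The function names (`soft_neighbour_probs`, `label_agreement`, `neighbourhood_score`) and the toy data are hypothetical and chosen only for illustration.

```python
import numpy as np


def soft_neighbour_probs(X, A):
    """NCA-style probabilities p_ij that point i picks j as its neighbour
    after the linear map z = A x (softmax over negative squared distances)."""
    Z = X @ A.T
    sq = np.sum(Z ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (Z @ Z.T), 0.0)
    np.fill_diagonal(d2, np.inf)  # a point never selects itself
    logits = -d2
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    P = np.exp(logits)
    return P / P.sum(axis=1, keepdims=True)


def label_agreement(Y):
    """ASSUMPTION: Jaccard similarity between binary label vectors stands in
    for label agreement; the paper may define agreement differently."""
    Y = Y.astype(float)
    inter = Y @ Y.T
    counts = Y.sum(axis=1)
    union = counts[:, None] + counts[None, :] - inter
    return np.where(union > 0, inter / np.maximum(union, 1.0), 0.0)


def neighbourhood_score(X, Y, A):
    """Mean expected label agreement with a softly chosen neighbour.
    One matrix A is scored against all labels at once."""
    P = soft_neighbour_probs(X, A)
    S = label_agreement(Y)
    return float(np.mean(np.sum(P * S, axis=1)))


# Toy data: feature 0 separates the two label sets, feature 1 is a distractor.
X = np.array([[0.0, 0.0], [0.1, 5.0], [3.0, 0.1], [3.1, 5.0]])
Y = np.array([[1, 0], [1, 0], [0, 1], [0, 1]])

A_identity = np.eye(2)               # leave the input space unchanged
A_project = np.array([[1.0, 0.0]])   # keep only the informative feature

score_identity = neighbourhood_score(X, Y, A_identity)
score_project = neighbourhood_score(X, Y, A_project)
print(score_identity, score_project)
```

In this toy setting the projection onto the informative feature scores higher than the identity map, because each point's soft neighbours then share its labels. A learning procedure in this spirit would search for the matrix `A` that maximises such a score, for example by gradient ascent, rather than comparing two hand-picked candidates.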