Multi-Label Output Codes using Canonical Correlation Analysis

Yi Zhang; Jeff Schneider

Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, PMLR 15:873-882, 2011.

Abstract

Traditional error-correcting output codes (ECOCs) decompose a multi-class classification problem into many binary problems. Although it seems natural to use ECOCs for multi-label problems as well, doing so naively creates issues related to the validity of the encoding, the efficiency of the decoding, the predictability of the generated codeword, and the exploitation of the label dependency. Using canonical correlation analysis, we propose an error-correcting code for multi-label classification. Label dependency is characterized as the most predictable directions in the label space, which are extracted as canonical output variates and encoded into the codeword. Predictions for the codeword define a graphical model of labels with both Bernoulli potentials (from classifiers on the labels) and Gaussian potentials (from regression on the canonical output variates). Decoding is performed by efficient mean-field approximation. We establish connections between the proposed code and research areas such as compressed sensing and ensemble learning. Some of these connections contribute to a better understanding of the new code, and others lead to practical improvements in code design. In our empirical study, the proposed code leads to substantial improvements compared to various competitors in music emotion classification and outdoor scene recognition.
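The encoding step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a textbook regularized CCA solved by whitening and SVD, and the names `cca_encode`, `inv_sqrt`, `n_variates`, and `reg` are hypothetical. It covers only the construction of the codeword (original labels followed by canonical output variates), not the mean-field decoding.

```python
import numpy as np


def inv_sqrt(mat):
    """Inverse square root of a symmetric positive-definite matrix."""
    vals, vecs = np.linalg.eigh(mat)
    return vecs @ np.diag(1.0 / np.sqrt(vals)) @ vecs.T


def cca_encode(X, Y, n_variates, reg=1e-3):
    """Build a codeword [labels, canonical output variates] (illustrative).

    X: (n, p) feature matrix; Y: (n, q) binary label matrix.
    `reg` is an assumed ridge term that keeps the covariances invertible.
    """
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)

    # Regularized covariance and cross-covariance estimates.
    Cxx = Xc.T @ Xc / n + reg * np.eye(X.shape[1])
    Cyy = Yc.T @ Yc / n + reg * np.eye(Y.shape[1])
    Cxy = Xc.T @ Yc / n

    # Whiten both views; singular values are the canonical correlations.
    Kx, Ky = inv_sqrt(Cxx), inv_sqrt(Cyy)
    _, corrs, Vt = np.linalg.svd(Kx @ Cxy @ Ky)

    # Label-space directions that are most predictable from the features.
    V = Ky @ Vt[:n_variates].T      # shape (q, n_variates)
    Z = Yc @ V                      # canonical output variates

    codeword = np.hstack([Y, Z])    # labels first, then variates
    return codeword, V, corrs[:n_variates]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 5))
    # Synthetic labels that depend on the features (made-up data).
    Y = (X[:, :4] + 0.5 * rng.normal(size=(200, 4)) > 0).astype(float)
    codeword, V, corrs = cca_encode(X, Y, n_variates=2)
    print(codeword.shape, corrs)
```

In such a setup, binary classifiers would be trained on the first `q` columns of the codeword and regressors on the remaining columns, matching the Bernoulli and Gaussian potentials mentioned in the abstract.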

Cite this Paper

BibTeX

@InProceedings{pmlr-v15-zhang11c,
  title = 	 {Multi-Label Output Codes using Canonical Correlation Analysis},
  author = 	 {Zhang, Yi and Schneider, Jeff},
  booktitle = 	 {Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics},
  pages = 	 {873--882},
  year = 	 {2011},
  editor = 	 {Gordon, Geoffrey and Dunson, David and Dudík, Miroslav},
  volume = 	 {15},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Fort Lauderdale, FL, USA},
  month = 	 {11--13 Apr},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v15/zhang11c/zhang11c.pdf},
  url = 	 {https://proceedings.mlr.press/v15/zhang11c.html},
  abstract = 	 {Traditional error-correcting output codes (ECOCs) decompose a multi-class classification problem into many binary problems.  Although it seems natural to use ECOCs for multi-label problems as well, doing so naively creates issues related to: the validity of the encoding, the efficiency of the decoding, the predictability of the generated codeword, and the exploitation of the label dependency.  Using canonical correlation analysis, we propose an error-correcting code for multi-label classification. Label dependency is characterized as the most predictable directions in the label space, which are extracted as canonical output variates and encoded into the codeword. Predictions for the codeword define a graphical model of labels with both Bernoulli potentials (from classifiers on the labels) and Gaussian potentials (from regression on the canonical output variates). Decoding is performed by efficient mean-field approximation.  We establish connections between the proposed code and research areas such as compressed sensing and ensemble learning. Some of these connections contribute to better understanding of the new code, and others lead to practical improvements in code design.  In our empirical study, the proposed code leads to substantial improvements compared to various competitors in music emotion classification and outdoor scene recognition.}
}

Endnote

%0 Conference Paper
%T Multi-Label Output Codes using Canonical Correlation Analysis
%A Yi Zhang
%A Jeff Schneider
%B Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2011
%E Geoffrey Gordon
%E David Dunson
%E Miroslav Dudík
%F pmlr-v15-zhang11c
%I PMLR
%P 873--882
%U https://proceedings.mlr.press/v15/zhang11c.html
%V 15
%X Traditional error-correcting output codes (ECOCs) decompose a multi-class classification problem into many binary problems.  Although it seems natural to use ECOCs for multi-label problems as well, doing so naively creates issues related to: the validity of the encoding, the efficiency of the decoding, the predictability of the generated codeword, and the exploitation of the label dependency.  Using canonical correlation analysis, we propose an error-correcting code for multi-label classification. Label dependency is characterized as the most predictable directions in the label space, which are extracted as canonical output variates and encoded into the codeword. Predictions for the codeword define a graphical model of labels with both Bernoulli potentials (from classifiers on the labels) and Gaussian potentials (from regression on the canonical output variates). Decoding is performed by efficient mean-field approximation.  We establish connections between the proposed code and research areas such as compressed sensing and ensemble learning. Some of these connections contribute to better understanding of the new code, and others lead to practical improvements in code design.  In our empirical study, the proposed code leads to substantial improvements compared to various competitors in music emotion classification and outdoor scene recognition.

RIS

TY  - CPAPER
TI  - Multi-Label Output Codes using Canonical Correlation Analysis
AU  - Yi Zhang
AU  - Jeff Schneider
BT  - Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics
DA  - 2011/06/14
ED  - Geoffrey Gordon
ED  - David Dunson
ED  - Miroslav Dudík
ID  - pmlr-v15-zhang11c
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 15
SP  - 873
EP  - 882
L1  - http://proceedings.mlr.press/v15/zhang11c/zhang11c.pdf
UR  - https://proceedings.mlr.press/v15/zhang11c.html
AB  - Traditional error-correcting output codes (ECOCs) decompose a multi-class classification problem into many binary problems.  Although it seems natural to use ECOCs for multi-label problems as well, doing so naively creates issues related to: the validity of the encoding, the efficiency of the decoding, the predictability of the generated codeword, and the exploitation of the label dependency.  Using canonical correlation analysis, we propose an error-correcting code for multi-label classification. Label dependency is characterized as the most predictable directions in the label space, which are extracted as canonical output variates and encoded into the codeword. Predictions for the codeword define a graphical model of labels with both Bernoulli potentials (from classifiers on the labels) and Gaussian potentials (from regression on the canonical output variates). Decoding is performed by efficient mean-field approximation.  We establish connections between the proposed code and research areas such as compressed sensing and ensemble learning. Some of these connections contribute to better understanding of the new code, and others lead to practical improvements in code design.  In our empirical study, the proposed code leads to substantial improvements compared to various competitors in music emotion classification and outdoor scene recognition.
ER  -

APA

Zhang, Y. & Schneider, J. (2011). Multi-Label Output Codes using Canonical Correlation Analysis. Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 15:873-882. Available from https://proceedings.mlr.press/v15/zhang11c.html.
