Unsupervised Aggregation for Classification Problems with Large Numbers of Categories

Ivan Titov; Alexandre Klementiev; Kevin Small; Dan Roth

Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, PMLR 9:836-843, 2010.

Abstract

Classification problems with a very large or unbounded set of output categories are common in many areas such as natural language and image processing. In order to improve accuracy on these tasks, it is natural for a decision-maker to combine predictions from various sources. However, supervised data needed to fit an aggregation model is often difficult to obtain, especially if needed for multiple domains. Therefore, we propose a generative model for unsupervised aggregation which exploits the agreement signal to estimate the expertise of individual judges. Due to the large output space size, this aggregation model cannot encode expertise of constituent judges with respect to every category for all problems. Consequently, we extend it by incorporating the notion of category types to account for variability of the judge expertise depending on the type. The viability of our approach is demonstrated both on synthetic experiments and on a practical task of syntactic parser aggregation.

Cite this Paper

BibTeX


@InProceedings{pmlr-v9-titov10a,
  title = 	 {Unsupervised Aggregation for Classification Problems with Large Numbers of Categories},
  author = 	 {Titov, Ivan and Klementiev, Alexandre and Small, Kevin and Roth, Dan},
  booktitle = 	 {Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics},
  pages = 	 {836--843},
  year = 	 {2010},
  editor = 	 {Teh, Yee Whye and Titterington, Mike},
  volume = 	 {9},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Chia Laguna Resort, Sardinia, Italy},
  month = 	 {13--15 May},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v9/titov10a/titov10a.pdf},
  url = 	 {https://proceedings.mlr.press/v9/titov10a.html},
  abstract = 	 {Classification problems with a very large or unbounded set of output categories are common in many areas such as natural language and image processing. In order to improve accuracy on these tasks, it is natural for a decision-maker to combine predictions from various sources. However, supervised data needed to fit an aggregation model is often difficult to obtain, especially if needed for multiple domains. Therefore, we propose a generative model for unsupervised aggregation which exploits the agreement signal to estimate the expertise of individual judges. Due to the large output space size, this aggregation model cannot encode expertise of constituent judges with respect to every category for all problems. Consequently, we extend it by incorporating the notion of category types to account for variability of the judge expertise depending on the type. The viability of our approach is demonstrated both on synthetic experiments and on a practical task of syntactic parser aggregation.}
}

Endnote

%0 Conference Paper
%T Unsupervised Aggregation for Classification Problems with Large Numbers of Categories
%A Ivan Titov
%A Alexandre Klementiev
%A Kevin Small
%A Dan Roth
%B Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2010
%E Yee Whye Teh
%E Mike Titterington
%F pmlr-v9-titov10a
%I PMLR
%P 836--843
%U https://proceedings.mlr.press/v9/titov10a.html
%V 9
%X Classification problems with a very large or unbounded set of output categories are common in many areas such as natural language and image processing. In order to improve accuracy on these tasks, it is natural for a decision-maker to combine predictions from various sources. However, supervised data needed to fit an aggregation model is often difficult to obtain, especially if needed for multiple domains. Therefore, we propose a generative model for unsupervised aggregation which exploits the agreement signal to estimate the expertise of individual judges. Due to the large output space size, this aggregation model cannot encode expertise of constituent judges with respect to every category for all problems. Consequently, we extend it by incorporating the notion of category types to account for variability of the judge expertise depending on the type. The viability of our approach is demonstrated both on synthetic experiments and on a practical task of syntactic parser aggregation.

RIS


TY  - CPAPER
TI  - Unsupervised Aggregation for Classification Problems with Large Numbers of Categories
AU  - Ivan Titov
AU  - Alexandre Klementiev
AU  - Kevin Small
AU  - Dan Roth
BT  - Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics
DA  - 2010/03/31
ED  - Yee Whye Teh
ED  - Mike Titterington
ID  - pmlr-v9-titov10a
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 9
SP  - 836
EP  - 843
L1  - http://proceedings.mlr.press/v9/titov10a/titov10a.pdf
UR  - https://proceedings.mlr.press/v9/titov10a.html
AB  - Classification problems with a very large or unbounded set of output categories are common in many areas such as natural language and image processing. In order to improve accuracy on these tasks, it is natural for a decision-maker to combine predictions from various sources. However, supervised data needed to fit an aggregation model is often difficult to obtain, especially if needed for multiple domains. Therefore, we propose a generative model for unsupervised aggregation which exploits the agreement signal to estimate the expertise of individual judges. Due to the large output space size, this aggregation model cannot encode expertise of constituent judges with respect to every category for all problems. Consequently, we extend it by incorporating the notion of category types to account for variability of the judge expertise depending on the type. The viability of our approach is demonstrated both on synthetic experiments and on a practical task of syntactic parser aggregation.
ER  -

APA


Titov, I., Klementiev, A., Small, K., & Roth, D. (2010). Unsupervised Aggregation for Classification Problems with Large Numbers of Categories. Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 9:836-843. Available from https://proceedings.mlr.press/v9/titov10a.html.
