Learning Character-level Representations for Part-of-Speech Tagging

Cicero Dos Santos; Bianca Zadrozny

Learning Character-level Representations for Part-of-Speech Tagging

Cicero Dos Santos, Bianca Zadrozny

Proceedings of the 31st International Conference on Machine Learning, PMLR 32(2):1818-1826, 2014.

Abstract

Distributed word representations have recently been proven to be an invaluable resource for NLP. These representations are normally learned using neural networks and capture syntactic and semantic information about words. Information about word morphology and shape is normally ignored when learning word representations. However, for tasks like part-of-speech tagging, intra-word information is extremely useful, specially when dealing with morphologically rich languages. In this paper, we propose a deep neural network that learns character-level representation of words and associate them with usual word representations to perform POS tagging. Using the proposed approach, while avoiding the use of any handcrafted feature, we produce state-of-the-art POS taggers for two languages: English, with 97.32% accuracy on the Penn Treebank WSJ corpus; and Portuguese, with 97.47% accuracy on the Mac-Morpho corpus, where the latter represents an error reduction of 12.2% on the best previous known result.

Cite this Paper

BibTeX


@InProceedings{pmlr-v32-santos14,
  title = 	 {Learning Character-level Representations for Part-of-Speech Tagging},
  author = 	 {Santos, Cicero Dos and Zadrozny, Bianca},
  booktitle = 	 {Proceedings of the 31st International Conference on Machine Learning},
  pages = 	 {1818--1826},
  year = 	 {2014},
  editor = 	 {Xing, Eric P. and Jebara, Tony},
  volume = 	 {32},
  number =       {2},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Bejing, China},
  month = 	 {22--24 Jun},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v32/santos14.pdf},
  url = 	 {https://proceedings.mlr.press/v32/santos14.html},
  abstract = 	 {Distributed word representations have recently been proven to be an invaluable resource for NLP. These representations are normally learned using neural networks and capture syntactic and semantic information about words. Information about word morphology and shape is normally ignored when learning word representations. However, for tasks like part-of-speech tagging, intra-word information is extremely useful, specially when dealing with morphologically rich languages. In this paper, we propose a deep neural network that learns character-level representation of words and associate them with usual word representations to perform POS tagging. Using the proposed approach, while avoiding the use of any handcrafted feature, we produce state-of-the-art POS taggers for two languages: English, with 97.32% accuracy on the Penn Treebank WSJ corpus; and Portuguese, with 97.47% accuracy on the Mac-Morpho corpus, where the latter represents an error reduction of 12.2% on the best previous known result.}
}

Endnote

%0 Conference Paper
%T Learning Character-level Representations for Part-of-Speech Tagging
%A Cicero Dos Santos
%A Bianca Zadrozny
%B Proceedings of the 31st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2014
%E Eric P. Xing
%E Tony Jebara	
%F pmlr-v32-santos14
%I PMLR
%P 1818--1826
%U https://proceedings.mlr.press/v32/santos14.html
%V 32
%N 2
%X Distributed word representations have recently been proven to be an invaluable resource for NLP. These representations are normally learned using neural networks and capture syntactic and semantic information about words. Information about word morphology and shape is normally ignored when learning word representations. However, for tasks like part-of-speech tagging, intra-word information is extremely useful, specially when dealing with morphologically rich languages. In this paper, we propose a deep neural network that learns character-level representation of words and associate them with usual word representations to perform POS tagging. Using the proposed approach, while avoiding the use of any handcrafted feature, we produce state-of-the-art POS taggers for two languages: English, with 97.32% accuracy on the Penn Treebank WSJ corpus; and Portuguese, with 97.47% accuracy on the Mac-Morpho corpus, where the latter represents an error reduction of 12.2% on the best previous known result.

RIS


TY  - CPAPER
TI  - Learning Character-level Representations for Part-of-Speech Tagging
AU  - Cicero Dos Santos
AU  - Bianca Zadrozny
BT  - Proceedings of the 31st International Conference on Machine Learning
DA  - 2014/06/18
ED  - Eric P. Xing
ED  - Tony Jebara	
ID  - pmlr-v32-santos14
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 32
IS  - 2
SP  - 1818
EP  - 1826
L1  - http://proceedings.mlr.press/v32/santos14.pdf
UR  - https://proceedings.mlr.press/v32/santos14.html
AB  - Distributed word representations have recently been proven to be an invaluable resource for NLP. These representations are normally learned using neural networks and capture syntactic and semantic information about words. Information about word morphology and shape is normally ignored when learning word representations. However, for tasks like part-of-speech tagging, intra-word information is extremely useful, specially when dealing with morphologically rich languages. In this paper, we propose a deep neural network that learns character-level representation of words and associate them with usual word representations to perform POS tagging. Using the proposed approach, while avoiding the use of any handcrafted feature, we produce state-of-the-art POS taggers for two languages: English, with 97.32% accuracy on the Penn Treebank WSJ corpus; and Portuguese, with 97.47% accuracy on the Mac-Morpho corpus, where the latter represents an error reduction of 12.2% on the best previous known result.
ER  -

APA


Santos, C.D. & Zadrozny, B.. (2014). Learning Character-level Representations for Part-of-Speech Tagging. Proceedings of the 31st International Conference on Machine Learning, in Proceedings of Machine Learning Research 32(2):1818-1826 Available from https://proceedings.mlr.press/v32/santos14.html.

Related Material

Download PDF