Classifying New Words for Robust Parsing
Pre-proceedings of the Fifth International Workshop on Artificial Intelligence and Statistics, PMLR R0:226-232, 1995.
Robust natural language parsing systems must be able to handle words that are not in their lexicons. This paper describes a statistical classifier that determines the most likely parts of speech of new words. The classifier uses a loglinear model to obtain smoothed conditional probabilities that take into account the interactions between different features. We report accuracy results for this model and compare it with several simpler methods.
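
As an illustration of how a loglinear model can yield conditional part-of-speech probabilities from interacting features, the following Python sketch trains a small conditional loglinear (softmax) classifier on toy data. The feature set, the tag inventory, the training words, and the gradient-ascent fitting with an L2 penalty are all assumptions made for this example; none of them is taken from the paper, whose exact model formulation and smoothing procedure may differ.

```python
import math
from collections import defaultdict

# Hypothetical tag inventory; the paper's tag set is not given in the abstract.
TAGS = ["NOUN", "VERB", "ADJ"]

# Toy training words invented for this sketch (not data from the paper).
DATA = [
    ("walking", "VERB"), ("running", "VERB"), ("jumping", "VERB"),
    ("happiness", "NOUN"), ("kindness", "NOUN"),
    ("London", "NOUN"), ("Boston", "NOUN"),
    ("colorful", "ADJ"), ("helpful", "ADJ"), ("well-known", "ADJ"),
]


def features(word):
    """Assumed surface features of an unseen word, plus one interaction term."""
    cap = word[0].isupper()
    suffix = word[-3:].lower() if len(word) > 3 else "<short>"
    feats = [f"cap={cap}", f"suffix={suffix}", f"hyphen={'-' in word}"]
    # Interaction feature: capitalization and suffix considered jointly.
    feats.append(f"cap={cap}&suffix={suffix}")
    return feats


def predict(weights, word):
    """Return P(tag | word): a softmax over the summed feature weights."""
    scores = {t: sum(weights[(f, t)] for f in features(word)) for t in TAGS}
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {t: math.exp(s - top) for t, s in scores.items()}
    z = sum(exps.values())
    return {t: e / z for t, e in exps.items()}


def train(data, epochs=200, lr=0.5, l2=0.01):
    """Fit weights by batch gradient ascent on the L2-penalized log-likelihood."""
    weights = defaultdict(float)
    for _ in range(epochs):
        grad = defaultdict(float)
        for word, gold in data:
            probs = predict(weights, word)
            for f in features(word):
                for t in TAGS:
                    # Observed feature count minus expected count under the model.
                    grad[(f, t)] += (1.0 if t == gold else 0.0) - probs[t]
        for key, g in grad.items():
            weights[key] += lr * (g / len(data) - l2 * weights[key])
    return weights


weights = train(DATA)
probs = predict(weights, "blorping")  # a made-up word absent from DATA
best_tag = max(probs, key=probs.get)
print(best_tag, probs)
```

In this sketch the joint capitalization-and-suffix feature lets the model weight that pair differently from the sum of its two parts, which is the sense of "interaction" being illustrated. The L2 penalty stands in for smoothing: it keeps rarely observed feature combinations from receiving extreme weights.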