Discovering Morphemic Suffixes: A Case Study in MDL Induction
Pre-proceedings of the Fifth International Workshop on Artificial Intelligence and Statistics, PMLR R0:64-75, 1995.
This paper reports experiments in the automatic discovery of linguistically significant regularities in text. The minimum description length (MDL) principle is used to evaluate linguistic hypotheses with respect to a corpus and a theory of the types of regularities to be found in it. The domain of inquiry in this paper is the discovery of morphemic suffixes such as English -ing and -ly, but the technique is widely applicable to language learning problems.
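As a rough illustration of how a description-length criterion can rank suffix hypotheses, the following minimal sketch compares two hypotheses on a toy word list. It is not the encoding scheme used in the paper: the cost model (a fixed number of bits per character for the lexicon, plus fixed-length stem and suffix indices for each token), the word list, and the names `split_word` and `description_length` are all assumptions made for illustration.

```python
import math

# Toy assumption: 26 letters plus one end-of-entry marker, coded uniformly.
BITS_PER_CHAR = math.log2(27)

def split_word(word, suffixes):
    """Split a word into (stem, suffix) using the longest matching
    hypothesized suffix that leaves a nonempty stem."""
    for suf in sorted(suffixes, key=len, reverse=True):
        if suf and word.endswith(suf) and len(word) > len(suf):
            return word[:-len(suf)], suf
    return word, ""

def description_length(corpus, suffixes):
    """Two-part code length in bits: cost of the lexicon (stems and
    suffixes) plus cost of the corpus encoded with that lexicon."""
    suffixes = set(suffixes) | {""}  # the empty suffix is always available
    analyses = [split_word(w, suffixes) for w in corpus]
    stems = {stem for stem, _ in analyses}

    # Model cost: spell out every stem and every suffix, each with a terminator.
    model_bits = sum((len(s) + 1) * BITS_PER_CHAR for s in stems)
    model_bits += sum((len(s) + 1) * BITS_PER_CHAR for s in suffixes)

    # Data cost: each token is a fixed-length stem index plus suffix index.
    bits_per_token = math.log2(max(1, len(stems))) + math.log2(len(suffixes))
    data_bits = len(corpus) * bits_per_token

    return model_bits + data_bits

corpus = ["walk", "walking", "talk", "talking", "jump", "jumping",
          "quick", "quickly", "slow", "slowly"]

no_suffixes = description_length(corpus, [])
with_suffixes = description_length(corpus, ["ing", "ly"])
print(f"no suffixes:   {no_suffixes:.1f} bits")
print(f"-ing and -ly:  {with_suffixes:.1f} bits")
```

Under this toy cost model, the hypothesis that posits -ing and -ly yields the shorter total description, because the savings from storing each stem once outweigh the added cost of the suffix list and the suffix index on each token.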