Grammar Induction: Beyond Local Search
Proceedings of the Eleventh International Conference on Grammatical Inference, PMLR 21:112-113, 2012.
Many approaches to probabilistic grammar induction operate by iteratively improving a single grammar, beginning with an initial guess. These local search paradigms include (variational) EM, MCMC, and greedy model merging or splitting procedures. Unfortunately, local search methods tend to get caught in local optima, even with random restarts. Two approaches are outlined that try to avoid this problem. One uses branch-and-bound methods from mathematical programming to eliminate regions of parameter space that cannot contain the global optimum. The other is inspired by recent work on deep learning, and uses spectral methods to build up featural representations of all substrings, without premature commitment to which substrings are constituents.