Low-Rank Approximation of Weighted Tree Automata

Guillaume Rabusseau, Borja Balle, Shay Cohen
Proceedings of the 19th International Conference on Artificial Intelligence and Statistics, PMLR 51:839-847, 2016.

Abstract

We describe a technique to minimize weighted tree automata (WTA), a powerful formalisms that subsumes probabilistic context-free grammars (PCFGs) and latent-variable PCFGs. Our method relies on a singular value decomposition of the underlying Hankel matrix defined by the WTA. Our main theoretical result is an efficient algorithm for computing the SVD of an infinite Hankel matrix implicitly represented as a WTA. We evaluate our method on real-world data originating in newswire treebank. We show that the model achieves lower perplexity than previous methods for PCFG minimization, and also is much more stable due to the absence of local optima.

Cite this Paper


BibTeX
@InProceedings{pmlr-v51-rabusseau16, title = {Low-Rank Approximation of Weighted Tree Automata}, author = {Rabusseau, Guillaume and Balle, Borja and Cohen, Shay}, booktitle = {Proceedings of the 19th International Conference on Artificial Intelligence and Statistics}, pages = {839--847}, year = {2016}, editor = {Gretton, Arthur and Robert, Christian C.}, volume = {51}, series = {Proceedings of Machine Learning Research}, address = {Cadiz, Spain}, month = {09--11 May}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v51/rabusseau16.pdf}, url = {https://proceedings.mlr.press/v51/rabusseau16.html}, abstract = {We describe a technique to minimize weighted tree automata (WTA), a powerful formalisms that subsumes probabilistic context-free grammars (PCFGs) and latent-variable PCFGs. Our method relies on a singular value decomposition of the underlying Hankel matrix defined by the WTA. Our main theoretical result is an efficient algorithm for computing the SVD of an infinite Hankel matrix implicitly represented as a WTA. We evaluate our method on real-world data originating in newswire treebank. We show that the model achieves lower perplexity than previous methods for PCFG minimization, and also is much more stable due to the absence of local optima.} }
Endnote
%0 Conference Paper %T Low-Rank Approximation of Weighted Tree Automata %A Guillaume Rabusseau %A Borja Balle %A Shay Cohen %B Proceedings of the 19th International Conference on Artificial Intelligence and Statistics %C Proceedings of Machine Learning Research %D 2016 %E Arthur Gretton %E Christian C. Robert %F pmlr-v51-rabusseau16 %I PMLR %P 839--847 %U https://proceedings.mlr.press/v51/rabusseau16.html %V 51 %X We describe a technique to minimize weighted tree automata (WTA), a powerful formalisms that subsumes probabilistic context-free grammars (PCFGs) and latent-variable PCFGs. Our method relies on a singular value decomposition of the underlying Hankel matrix defined by the WTA. Our main theoretical result is an efficient algorithm for computing the SVD of an infinite Hankel matrix implicitly represented as a WTA. We evaluate our method on real-world data originating in newswire treebank. We show that the model achieves lower perplexity than previous methods for PCFG minimization, and also is much more stable due to the absence of local optima.
RIS
TY - CPAPER TI - Low-Rank Approximation of Weighted Tree Automata AU - Guillaume Rabusseau AU - Borja Balle AU - Shay Cohen BT - Proceedings of the 19th International Conference on Artificial Intelligence and Statistics DA - 2016/05/02 ED - Arthur Gretton ED - Christian C. Robert ID - pmlr-v51-rabusseau16 PB - PMLR DP - Proceedings of Machine Learning Research VL - 51 SP - 839 EP - 847 L1 - http://proceedings.mlr.press/v51/rabusseau16.pdf UR - https://proceedings.mlr.press/v51/rabusseau16.html AB - We describe a technique to minimize weighted tree automata (WTA), a powerful formalisms that subsumes probabilistic context-free grammars (PCFGs) and latent-variable PCFGs. Our method relies on a singular value decomposition of the underlying Hankel matrix defined by the WTA. Our main theoretical result is an efficient algorithm for computing the SVD of an infinite Hankel matrix implicitly represented as a WTA. We evaluate our method on real-world data originating in newswire treebank. We show that the model achieves lower perplexity than previous methods for PCFG minimization, and also is much more stable due to the absence of local optima. ER -
APA
Rabusseau, G., Balle, B. & Cohen, S.. (2016). Low-Rank Approximation of Weighted Tree Automata. Proceedings of the 19th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 51:839-847 Available from https://proceedings.mlr.press/v51/rabusseau16.html.

Related Material