Formal languages and neural models for learning on sequences

William Merrill
Proceedings of 16th edition of the International Conference on Grammatical Inference, PMLR 217:5-5, 2023.

Abstract

The empirical success of deep learning in NLP and related fields motivates understanding the model of grammar implicit within neural networks on a theoretical level. In this tutorial, I will overview recent empirical and theoretical insights on the power of neural networks as formal language recognizers. We will cover the classical proof that infinite-precision RNNs are Turing-complete, formal analysis and experiments comparing the relative power of different finite-precision RNN architectures, and recent work characterizing transformers as language recognizers using circuits and logic. We may also cover applications of this work, including the extraction of discrete models from neural networks. Hopefully, the tutorial will synthesize different analysis frameworks and findings about neural networks into a coherent narrative, and provide a call to action for the ICGI community to engage with exciting open questions.

Cite this Paper


BibTeX
@InProceedings{pmlr-v217-merrill23a, title = {Formal languages and neural models for learning on sequences}, author = {Merrill, William}, booktitle = {Proceedings of 16th edition of the International Conference on Grammatical Inference}, pages = {5--5}, year = {2023}, editor = {Coste, Fran├žois and Ouardi, Faissal and Rabusseau, Guillaume}, volume = {217}, series = {Proceedings of Machine Learning Research}, month = {10--13 Jul}, publisher = {PMLR}, pdf = {https://proceedings.mlr.press/v217/merrill23a/merrill23a.pdf}, url = {https://proceedings.mlr.press/v217/merrill23a.html}, abstract = {The empirical success of deep learning in NLP and related fields motivates understanding the model of grammar implicit within neural networks on a theoretical level. In this tutorial, I will overview recent empirical and theoretical insights on the power of neural networks as formal language recognizers. We will cover the classical proof that infinite-precision RNNs are Turing-complete, formal analysis and experiments comparing the relative power of different finite-precision RNN architectures, and recent work characterizing transformers as language recognizers using circuits and logic. We may also cover applications of this work, including the extraction of discrete models from neural networks. Hopefully, the tutorial will synthesize different analysis frameworks and findings about neural networks into a coherent narrative, and provide a call to action for the ICGI community to engage with exciting open questions.} }
Endnote
%0 Conference Paper %T Formal languages and neural models for learning on sequences %A William Merrill %B Proceedings of 16th edition of the International Conference on Grammatical Inference %C Proceedings of Machine Learning Research %D 2023 %E Fran├žois Coste %E Faissal Ouardi %E Guillaume Rabusseau %F pmlr-v217-merrill23a %I PMLR %P 5--5 %U https://proceedings.mlr.press/v217/merrill23a.html %V 217 %X The empirical success of deep learning in NLP and related fields motivates understanding the model of grammar implicit within neural networks on a theoretical level. In this tutorial, I will overview recent empirical and theoretical insights on the power of neural networks as formal language recognizers. We will cover the classical proof that infinite-precision RNNs are Turing-complete, formal analysis and experiments comparing the relative power of different finite-precision RNN architectures, and recent work characterizing transformers as language recognizers using circuits and logic. We may also cover applications of this work, including the extraction of discrete models from neural networks. Hopefully, the tutorial will synthesize different analysis frameworks and findings about neural networks into a coherent narrative, and provide a call to action for the ICGI community to engage with exciting open questions.
APA
Merrill, W.. (2023). Formal languages and neural models for learning on sequences. Proceedings of 16th edition of the International Conference on Grammatical Inference, in Proceedings of Machine Learning Research 217:5-5 Available from https://proceedings.mlr.press/v217/merrill23a.html.

Related Material