Formal and Empirical Studies of Counting Behaviour in ReLU RNNs

Nadine El-Naggar, Andrew Ryzhikov, Laure Daviaud, Pranava Madhyastha, Tillman Weyde
Proceedings of 16th edition of the International Conference on Grammatical Inference, PMLR 217:199-222, 2023.

Abstract

In recent years, the discussion about systematicity of neural network learning has gained renewed interest, in particular the formal analysis of neural network behaviour. In this paper, we investigate the capability of single-cell ReLU RNN models to demonstrate precise counting behaviour. Formally, we start by characterising the semi-Dyck-1 language and semi-Dyck-1 counter machine that can be implemented by a single Rectified Linear Unit (ReLU) cell. We define three Counter Indicator Conditions (CICs) on the weights of a ReLU cell and show that fulfilling these conditions is equivalent to accepting the semi-Dyck-1 language, i.e. to perform exact counting. Empirically, we study the ability of single-cell ReLU RNNs to learn to count by training and testing them on different datasets of Dyck-1 and semi-Dyck-1 strings. While networks that satisfy the CICs count exactly and thus correctly even on very long strings, the trained networks exhibit a wide range of results and never satisfy the CICs exactly. We investigate the effect of deviating from the CICs and find that configurations that fulfil the CICs are not at a minimum of the loss function in the most common setups. This is consistent with observations in previous research indicating that training ReLU networks for counting tasks often leads to poor results. We finally discuss implications of these results and possible avenues for improving network behaviour.

Cite this Paper


BibTeX
@InProceedings{pmlr-v217-el-naggar23a, title = {Formal and Empirical Studies of Counting Behaviour in ReLU RNNs}, author = {El-Naggar, Nadine and Ryzhikov, Andrew and Daviaud, Laure and Madhyastha, Pranava and Weyde, Tillman}, booktitle = {Proceedings of 16th edition of the International Conference on Grammatical Inference}, pages = {199--222}, year = {2023}, editor = {Coste, Fran├žois and Ouardi, Faissal and Rabusseau, Guillaume}, volume = {217}, series = {Proceedings of Machine Learning Research}, month = {10--13 Jul}, publisher = {PMLR}, pdf = {https://proceedings.mlr.press/v217/el-naggar23a/el-naggar23a.pdf}, url = {https://proceedings.mlr.press/v217/el-naggar23a.html}, abstract = {In recent years, the discussion about systematicity of neural network learning has gained renewed interest, in particular the formal analysis of neural network behaviour. In this paper, we investigate the capability of single-cell ReLU RNN models to demonstrate precise counting behaviour. Formally, we start by characterising the semi-Dyck-1 language and semi-Dyck-1 counter machine that can be implemented by a single Rectified Linear Unit (ReLU) cell. We define three Counter Indicator Conditions (CICs) on the weights of a ReLU cell and show that fulfilling these conditions is equivalent to accepting the semi-Dyck-1 language, i.e. to perform exact counting. Empirically, we study the ability of single-cell ReLU RNNs to learn to count by training and testing them on different datasets of Dyck-1 and semi-Dyck-1 strings. While networks that satisfy the CICs count exactly and thus correctly even on very long strings, the trained networks exhibit a wide range of results and never satisfy the CICs exactly. We investigate the effect of deviating from the CICs and find that configurations that fulfil the CICs are not at a minimum of the loss function in the most common setups. This is consistent with observations in previous research indicating that training ReLU networks for counting tasks often leads to poor results. We finally discuss implications of these results and possible avenues for improving network behaviour.} }
Endnote
%0 Conference Paper %T Formal and Empirical Studies of Counting Behaviour in ReLU RNNs %A Nadine El-Naggar %A Andrew Ryzhikov %A Laure Daviaud %A Pranava Madhyastha %A Tillman Weyde %B Proceedings of 16th edition of the International Conference on Grammatical Inference %C Proceedings of Machine Learning Research %D 2023 %E Fran├žois Coste %E Faissal Ouardi %E Guillaume Rabusseau %F pmlr-v217-el-naggar23a %I PMLR %P 199--222 %U https://proceedings.mlr.press/v217/el-naggar23a.html %V 217 %X In recent years, the discussion about systematicity of neural network learning has gained renewed interest, in particular the formal analysis of neural network behaviour. In this paper, we investigate the capability of single-cell ReLU RNN models to demonstrate precise counting behaviour. Formally, we start by characterising the semi-Dyck-1 language and semi-Dyck-1 counter machine that can be implemented by a single Rectified Linear Unit (ReLU) cell. We define three Counter Indicator Conditions (CICs) on the weights of a ReLU cell and show that fulfilling these conditions is equivalent to accepting the semi-Dyck-1 language, i.e. to perform exact counting. Empirically, we study the ability of single-cell ReLU RNNs to learn to count by training and testing them on different datasets of Dyck-1 and semi-Dyck-1 strings. While networks that satisfy the CICs count exactly and thus correctly even on very long strings, the trained networks exhibit a wide range of results and never satisfy the CICs exactly. We investigate the effect of deviating from the CICs and find that configurations that fulfil the CICs are not at a minimum of the loss function in the most common setups. This is consistent with observations in previous research indicating that training ReLU networks for counting tasks often leads to poor results. We finally discuss implications of these results and possible avenues for improving network behaviour.
APA
El-Naggar, N., Ryzhikov, A., Daviaud, L., Madhyastha, P. & Weyde, T.. (2023). Formal and Empirical Studies of Counting Behaviour in ReLU RNNs. Proceedings of 16th edition of the International Conference on Grammatical Inference, in Proceedings of Machine Learning Research 217:199-222 Available from https://proceedings.mlr.press/v217/el-naggar23a.html.

Related Material