The Complexity of Explaining Neural Networks Through (group) Invariants


Danielle Ensign, Scott Neville, Arnab Paul, Suresh Venkatasubramanian ;
Proceedings of the 28th International Conference on Algorithmic Learning Theory, PMLR 76:341-359, 2017.


Ever since the work of Minsky and Papert, it has been thought that neural networks derive their effectiveness by finding representations of the data that are invariant with respect to the task. In other words, the representations eliminate components of the data that vary in a way that is irrelevant. These invariants are naturally expressed with respect to group operations, and thus an understanding of these groups is key to explaining the effectiveness of the neural network. Moreover, a line of work in deep learning has shown that explicit knowledge of group invariants can lead to more effective training results.

In this paper, we investigate the difficulty of discovering anything about these implicit invariants. Unfortunately, our main results are negative: we show that a variety of questions around investigating invariant representations are NP-hard, even in approximate settings. Moreover, these results do not depend on the kind of architecture used: in fact, our results follow as soon as the network architecture is powerful enough to be universal. The key idea behind our results is that if we can find the symmetries of a problem then we can solve it.

Related Material