Better depth-width trade-offs for neural networks through the lens of dynamical systems

Vaggos Chatziafratis, Sai Ganesh Nagarajan, Ioannis Panageas
Proceedings of the 37th International Conference on Machine Learning, PMLR 119:1469-1478, 2020.

Abstract

The expressivity of neural networks as a function of their depth, width and type of activation units has been an important question in deep learning theory. Recently, depth separation results for ReLU networks were obtained via a new connection with dynamical systems, using a generalized notion of fixed points of a continuous map f, called periodic points. In this work, we strengthen the connection with dynamical systems and we improve the existing width lower bounds along several aspects. Our first main result is period-specific width lower bounds that hold under the stronger notion of L1-approximation error, instead of the weaker classification error. Our second contribution is that we provide sharper width lower bounds, still yielding meaningful exponential depth-width separations, in regimes where previous results wouldn’t apply. A byproduct of our results is that there exists a universal constant characterizing the depth-width trade-offs, as long as f has odd periods. Technically, our results follow by unveiling a tighter connection between the following three quantities of a given function: its period, its Lipschitz constant and the growth rate of the number of oscillations arising under compositions of the function f with itself.
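The key mechanism in the abstract is the growth of oscillations under repeated composition of f. As an illustration only (not code from the paper), the Python sketch below iterates the tent map, a standard continuous map on [0, 1] with a period-3 orbit, and counts the monotone pieces of its t-fold composition; the function names and grid size are arbitrary choices made for this sketch.

import numpy as np

def tent(x):
    # Tent map on [0, 1]: a simple continuous map with a period-3 orbit.
    return 2.0 * x if x <= 0.5 else 2.0 * (1.0 - x)

def compose(f, t, x):
    # Evaluate the t-fold composition f^t(x).
    for _ in range(t):
        x = f(x)
    return x

def monotone_pieces(f, t, grid=20001):
    # Count maximal monotone pieces ("oscillations") of f^t on [0, 1]
    # via sign changes of finite differences on a uniform grid.
    xs = np.linspace(0.0, 1.0, grid)
    ys = np.array([compose(f, t, x) for x in xs])
    d = np.sign(np.diff(ys))
    d = d[d != 0]  # ignore flat steps
    return 1 + int(np.sum(d[1:] != d[:-1]))

for t in range(1, 7):
    print(f"t = {t}: {monotone_pieces(tent, t)} monotone pieces")

The count doubles with each composition (2, 4, 8, ...). Roughly speaking, the paper's lower bounds exploit this: a shallow network of small width cannot produce that many oscillations, so it cannot approximate the t-fold composition well, whereas a deep network can realize the composition directly.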

Cite this Paper


BibTeX
@InProceedings{pmlr-v119-chatziafratis20a,
  title     = {Better depth-width trade-offs for neural networks through the lens of dynamical systems},
  author    = {Chatziafratis, Vaggos and Nagarajan, Sai Ganesh and Panageas, Ioannis},
  booktitle = {Proceedings of the 37th International Conference on Machine Learning},
  pages     = {1469--1478},
  year      = {2020},
  editor    = {III, Hal Daumé and Singh, Aarti},
  volume    = {119},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--18 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v119/chatziafratis20a/chatziafratis20a.pdf},
  url       = {https://proceedings.mlr.press/v119/chatziafratis20a.html},
  abstract  = {The expressivity of neural networks as a function of their depth, width and type of activation units has been an important question in deep learning theory. Recently, depth separation results for ReLU networks were obtained via a new connection with dynamical systems, using a generalized notion of fixed points of a continuous map $f$, called periodic points. In this work, we strengthen the connection with dynamical systems and we improve the existing width lower bounds along several aspects. Our first main result is period-specific width lower bounds that hold under the stronger notion of $L^1$-approximation error, instead of the weaker classification error. Our second contribution is that we provide sharper width lower bounds, still yielding meaningful exponential depth-width separations, in regimes where previous results wouldn’t apply. A byproduct of our results is that there exists a universal constant characterizing the depth-width trade-offs, as long as $f$ has odd periods. Technically, our results follow by unveiling a tighter connection between the following three quantities of a given function: its period, its Lipschitz constant and the growth rate of the number of oscillations arising under compositions of the function $f$ with itself.}
}
Endnote
%0 Conference Paper
%T Better depth-width trade-offs for neural networks through the lens of dynamical systems
%A Vaggos Chatziafratis
%A Sai Ganesh Nagarajan
%A Ioannis Panageas
%B Proceedings of the 37th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2020
%E Hal Daumé III
%E Aarti Singh
%F pmlr-v119-chatziafratis20a
%I PMLR
%P 1469--1478
%U https://proceedings.mlr.press/v119/chatziafratis20a.html
%V 119
%X The expressivity of neural networks as a function of their depth, width and type of activation units has been an important question in deep learning theory. Recently, depth separation results for ReLU networks were obtained via a new connection with dynamical systems, using a generalized notion of fixed points of a continuous map $f$, called periodic points. In this work, we strengthen the connection with dynamical systems and we improve the existing width lower bounds along several aspects. Our first main result is period-specific width lower bounds that hold under the stronger notion of $L^1$-approximation error, instead of the weaker classification error. Our second contribution is that we provide sharper width lower bounds, still yielding meaningful exponential depth-width separations, in regimes where previous results wouldn’t apply. A byproduct of our results is that there exists a universal constant characterizing the depth-width trade-offs, as long as $f$ has odd periods. Technically, our results follow by unveiling a tighter connection between the following three quantities of a given function: its period, its Lipschitz constant and the growth rate of the number of oscillations arising under compositions of the function $f$ with itself.
APA
Chatziafratis, V., Nagarajan, S.G. & Panageas, I. (2020). Better depth-width trade-offs for neural networks through the lens of dynamical systems. Proceedings of the 37th International Conference on Machine Learning, in Proceedings of Machine Learning Research 119:1469-1478. Available from https://proceedings.mlr.press/v119/chatziafratis20a.html.
