On the Predictability of Pruning Across Scales

Jonathan S Rosenfeld; Jonathan Frankle; Michael Carbin; Nir Shavit

Proceedings of the 38th International Conference on Machine Learning, PMLR 139:9075-9083, 2021.

Abstract

We show that the error of iteratively magnitude-pruned networks empirically follows a scaling law with interpretable coefficients that depend on the architecture and task. We functionally approximate the error of the pruned networks, showing it is predictable in terms of an invariant tying width, depth, and pruning level, such that networks of vastly different pruned densities are interchangeable. We demonstrate the accuracy of this approximation over orders of magnitude in depth, width, dataset size, and density. We show that the functional form holds (generalizes) for large scale data (e.g., ImageNet) and architectures (e.g., ResNets). As neural networks become ever larger and costlier to train, our findings suggest a framework for reasoning conceptually and analytically about a standard method for unstructured pruning.

Cite this Paper

BibTeX

@InProceedings{pmlr-v139-rosenfeld21a,
  title = 	 {On the Predictability of Pruning Across Scales},
  author =       {Rosenfeld, Jonathan S and Frankle, Jonathan and Carbin, Michael and Shavit, Nir},
  booktitle = 	 {Proceedings of the 38th International Conference on Machine Learning},
  pages = 	 {9075--9083},
  year = 	 {2021},
  editor = 	 {Meila, Marina and Zhang, Tong},
  volume = 	 {139},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {18--24 Jul},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v139/rosenfeld21a/rosenfeld21a.pdf},
  url = 	 {https://proceedings.mlr.press/v139/rosenfeld21a.html},
  abstract = 	 {We show that the error of iteratively magnitude-pruned networks empirically follows a scaling law with interpretable coefficients that depend on the architecture and task. We functionally approximate the error of the pruned networks, showing it is predictable in terms of an invariant tying width, depth, and pruning level, such that networks of vastly different pruned densities are interchangeable. We demonstrate the accuracy of this approximation over orders of magnitude in depth, width, dataset size, and density. We show that the functional form holds (generalizes) for large scale data (e.g., ImageNet) and architectures (e.g., ResNets). As neural networks become ever larger and costlier to train, our findings suggest a framework for reasoning conceptually and analytically about a standard method for unstructured pruning.}
}

Endnote

%0 Conference Paper
%T On the Predictability of Pruning Across Scales
%A Jonathan S Rosenfeld
%A Jonathan Frankle
%A Michael Carbin
%A Nir Shavit
%B Proceedings of the 38th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2021
%E Marina Meila
%E Tong Zhang
%F pmlr-v139-rosenfeld21a
%I PMLR
%P 9075--9083
%U https://proceedings.mlr.press/v139/rosenfeld21a.html
%V 139
%X We show that the error of iteratively magnitude-pruned networks empirically follows a scaling law with interpretable coefficients that depend on the architecture and task. We functionally approximate the error of the pruned networks, showing it is predictable in terms of an invariant tying width, depth, and pruning level, such that networks of vastly different pruned densities are interchangeable. We demonstrate the accuracy of this approximation over orders of magnitude in depth, width, dataset size, and density. We show that the functional form holds (generalizes) for large scale data (e.g., ImageNet) and architectures (e.g., ResNets). As neural networks become ever larger and costlier to train, our findings suggest a framework for reasoning conceptually and analytically about a standard method for unstructured pruning.

APA

Rosenfeld, J.S., Frankle, J., Carbin, M. & Shavit, N. (2021). On the Predictability of Pruning Across Scales. Proceedings of the 38th International Conference on Machine Learning, in Proceedings of Machine Learning Research 139:9075-9083. Available from https://proceedings.mlr.press/v139/rosenfeld21a.html.
