Theoretical Compression Bounds for Wide Multilayer Perceptrons

Houssam El Cheairi, David Gamarnik, Rahul Mazumder
Proceedings of Thirty Ninth Conference on Learning Theory, PMLR 336:2200-2258, 2026.

Abstract

Pruning and quantization techniques have been broadly successful in reducing the number of parameters needed for large neural networks, yet theoretical justification for their empirical success falls short. We consider a randomized greedy compression algorithm for pruning and quantization post-training and use it to rigorously show the existence of pruned/quantized subnetworks of multilayer perceptrons (MLPs) with competitive performance. We further extend our results to structured pruning of MLPs and convolutional neural networks (CNNs), thus providing a unified analysis of pruning in wide networks. Our results are free of data assumptions, and showcase a tradeoff between compressibility and network width. The algorithm we consider bears some similarities with Optimal Brain Damage (OBD) and can be viewed as a post-training randomized version of it. The theoretical results we derive bridge the gap between theory and application for pruning/quantization, and provide a justification for the empirical success of compression in wide multilayer perceptrons.

Cite this Paper


BibTeX
@InProceedings{pmlr-v336-el-cheairi26a,
  title = {Theoretical Compression Bounds for Wide Multilayer Perceptrons},
  author = {El Cheairi, Houssam and Gamarnik, David and Mazumder, Rahul},
  booktitle = {Proceedings of Thirty Ninth Conference on Learning Theory},
  pages = {2200--2258},
  year = {2026},
  editor = {Hanneke, Steve and Lattimore, Tor},
  volume = {336},
  series = {Proceedings of Machine Learning Research},
  month = {29 Jun--03 Jul},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v336/main/assets/el-cheairi26a/el-cheairi26a.pdf},
  url = {https://proceedings.mlr.press/v336/el-cheairi26a.html},
  abstract = {Pruning and quantization techniques have been broadly successful in reducing the number of parameters needed for large neural networks, yet theoretical justification for their empirical success falls short. We consider a randomized greedy compression algorithm for pruning and quantization post-training and use it to rigorously show the existence of pruned/quantized subnetworks of multilayer perceptrons (MLPs) with competitive performance. We further extend our results to structured pruning of MLPs and convolutional neural networks (CNNs), thus providing a unified analysis of pruning in wide networks. Our results are free of data assumptions, and showcase a tradeoff between compressibility and network width. The algorithm we consider bears some similarities with Optimal Brain Damage (OBD) and can be viewed as a post-training randomized version of it. The theoretical results we derive bridge the gap between theory and application for pruning/quantization, and provide a justification for the empirical success of compression in wide multilayer perceptrons.}
}
Endnote
%0 Conference Paper
%T Theoretical Compression Bounds for Wide Multilayer Perceptrons
%A Houssam El Cheairi
%A David Gamarnik
%A Rahul Mazumder
%B Proceedings of Thirty Ninth Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2026
%E Steve Hanneke
%E Tor Lattimore
%F pmlr-v336-el-cheairi26a
%I PMLR
%P 2200--2258
%U https://proceedings.mlr.press/v336/el-cheairi26a.html
%V 336
%X Pruning and quantization techniques have been broadly successful in reducing the number of parameters needed for large neural networks, yet theoretical justification for their empirical success falls short. We consider a randomized greedy compression algorithm for pruning and quantization post-training and use it to rigorously show the existence of pruned/quantized subnetworks of multilayer perceptrons (MLPs) with competitive performance. We further extend our results to structured pruning of MLPs and convolutional neural networks (CNNs), thus providing a unified analysis of pruning in wide networks. Our results are free of data assumptions, and showcase a tradeoff between compressibility and network width. The algorithm we consider bears some similarities with Optimal Brain Damage (OBD) and can be viewed as a post-training randomized version of it. The theoretical results we derive bridge the gap between theory and application for pruning/quantization, and provide a justification for the empirical success of compression in wide multilayer perceptrons.
APA
El Cheairi, H., Gamarnik, D. & Mazumder, R. (2026). Theoretical Compression Bounds for Wide Multilayer Perceptrons. Proceedings of Thirty Ninth Conference on Learning Theory, in Proceedings of Machine Learning Research 336:2200-2258. Available from https://proceedings.mlr.press/v336/el-cheairi26a.html.

Related Material