Large Neural Networks at a Fraction

Aritra Mukhopadhyay, Adhilsha A, Subhankar Mishra
Proceedings of the 5th Northern Lights Deep Learning Conference (NLDL), PMLR 233:165-173, 2024.

Abstract

Large-scale deep learning models are known for their large number of parameters, which weigh heavily on computational resources. The Lottery Ticket Hypothesis demonstrated the potential of pruning to remove many of these parameters without a significant drop in accuracy. Quaternion neural networks achieve accuracy comparable to equivalent real-valued networks on multi-dimensional prediction tasks. In our work, we apply pruning to real- and quaternion-valued implementations of large-scale networks for image recognition. For instance, our implementation of the ResNet-101 architecture on the CIFAR-100 and ImageNet64x64 datasets resulted in pruned quaternion models outperforming their real-valued counterparts by 4% and 7% in accuracy at sparsities of about 6% and 0.4%, respectively. We also obtained stable lottery tickets for quaternion implementations of ResNet-101 and ResNet-152 on CIFAR-100, whereas the real-valued counterparts failed to train at the same sparsity. Our experiments show that pruned quaternion implementations perform better at higher sparsity than their real-valued counterparts, even in some larger neural networks.
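For illustration, the following is a minimal PyTorch sketch (not taken from the paper) of lottery-ticket-style iterative magnitude pruning on a real-valued ResNet-101 of the kind the abstract describes. The pruning schedule, the number of rounds, and the omitted training loop are assumptions; the paper's quaternion variants would additionally swap in quaternion convolution layers, which are not shown here.

import copy
import torch
import torch.nn.utils.prune as prune
from torchvision.models import resnet101

# Real-valued baseline with a CIFAR-100-sized head (assumed setup).
model = resnet101(num_classes=100)
initial_state = copy.deepcopy(model.state_dict())  # weights at initialization

def prunable_params(net):
    # All conv/linear weight tensors that global magnitude pruning touches.
    return [(m, "weight") for m in net.modules()
            if isinstance(m, (torch.nn.Conv2d, torch.nn.Linear))]

def train_to_convergence(net):
    pass  # placeholder for the usual supervised training loop

for round_idx in range(10):  # assumed 10 prune/rewind rounds
    train_to_convergence(model)
    # Zero out the globally smallest-magnitude weights. Passing the cumulative
    # target sparsity keeps already-pruned (zero) weights pruned and removes a
    # further ~20% of the remaining weights each round.
    target_sparsity = 1.0 - 0.8 ** (round_idx + 1)
    prune.global_unstructured(prunable_params(model),
                              pruning_method=prune.L1Unstructured,
                              amount=target_sparsity)
    # Lottery-ticket rewind: reset surviving weights to their initial values
    # while keeping the learned sparsity mask.
    with torch.no_grad():
        for name, module in model.named_modules():
            if hasattr(module, "weight_orig"):
                module.weight_orig.copy_(initial_state[f"{name}.weight"])

A quaternion implementation could reuse the same prune/rewind loop, with the mask applied to the quaternion layers' weight tensors instead.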

Cite this Paper


BibTeX
@InProceedings{pmlr-v233-mukhopadhyay24a,
  title     = {Large Neural Networks at a Fraction},
  author    = {Mukhopadhyay, Aritra and A, Adhilsha and Mishra, Subhankar},
  booktitle = {Proceedings of the 5th Northern Lights Deep Learning Conference ({NLDL})},
  pages     = {165--173},
  year      = {2024},
  editor    = {Lutchyn, Tetiana and Ramírez Rivera, Adín and Ricaud, Benjamin},
  volume    = {233},
  series    = {Proceedings of Machine Learning Research},
  month     = {09--11 Jan},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v233/mukhopadhyay24a/mukhopadhyay24a.pdf},
  url       = {https://proceedings.mlr.press/v233/mukhopadhyay24a.html},
  abstract  = {Large-scale deep learning models are known for their large amount of parameters, weighing down on the computational resources. The core of the Lottery Ticket Hypothesis showed us the potential of pruning to reduce such parameters without a significant drop in accuracy. Quaternion neural networks achieve comparable accuracy to equivalent real-valued networks on multi-dimensional prediction tasks. In our work, we implement pruning on real and quaternion-valued implementations of large-scale networks in the task of image recognition. For instance, our implementation of the ResNet-101 architecture on the CIFAR-100 and ImageNet64x64 datasets resulted in pruned quaternion models outperforming their real-valued counterparts by 4% and 7% in accuracy at sparsities of about 6% and 0.4%, respectively. We also got quaternion implementations of ResNet-101 and ResNet-152 on CIFAR-100 with steady Lottery tickets, whereas the Real counterpart failed to train at the same sparsity. Our experiments show that the pruned quaternion implementations perform better at higher sparsity than the corresponding real-valued counterpart, even in some larger neural networks.}
}
Endnote
%0 Conference Paper
%T Large Neural Networks at a Fraction
%A Aritra Mukhopadhyay
%A Adhilsha A
%A Subhankar Mishra
%B Proceedings of the 5th Northern Lights Deep Learning Conference ({NLDL})
%C Proceedings of Machine Learning Research
%D 2024
%E Tetiana Lutchyn
%E Adín Ramírez Rivera
%E Benjamin Ricaud
%F pmlr-v233-mukhopadhyay24a
%I PMLR
%P 165--173
%U https://proceedings.mlr.press/v233/mukhopadhyay24a.html
%V 233
%X Large-scale deep learning models are known for their large amount of parameters, weighing down on the computational resources. The core of the Lottery Ticket Hypothesis showed us the potential of pruning to reduce such parameters without a significant drop in accuracy. Quaternion neural networks achieve comparable accuracy to equivalent real-valued networks on multi-dimensional prediction tasks. In our work, we implement pruning on real and quaternion-valued implementations of large-scale networks in the task of image recognition. For instance, our implementation of the ResNet-101 architecture on the CIFAR-100 and ImageNet64x64 datasets resulted in pruned quaternion models outperforming their real-valued counterparts by 4% and 7% in accuracy at sparsities of about 6% and 0.4%, respectively. We also got quaternion implementations of ResNet-101 and ResNet-152 on CIFAR-100 with steady Lottery tickets, whereas the Real counterpart failed to train at the same sparsity. Our experiments show that the pruned quaternion implementations perform better at higher sparsity than the corresponding real-valued counterpart, even in some larger neural networks.
APA
Mukhopadhyay, A., A, A. & Mishra, S. (2024). Large Neural Networks at a Fraction. Proceedings of the 5th Northern Lights Deep Learning Conference (NLDL), in Proceedings of Machine Learning Research 233:165-173. Available from https://proceedings.mlr.press/v233/mukhopadhyay24a.html.