Leveraging Sparse Input and Sparse Models: Efficient Distributed Learning in Resource-Constrained Environments

Emmanouil Kariotakis, Grigorios Tsagkatakis, Panagiotis Tsakalides, Anastasios Kyrillidis
Conference on Parsimony and Learning, PMLR 234:554-569, 2024.

Abstract

Optimizing for reduced computational and bandwidth resources enables model training in less-than-ideal environments and paves the way for practical and accessible AI solutions. This work studies and designs a system that exploits sparsity in both the input layer and the intermediate layers of a neural network, and that is trained and operated in a distributed manner. Focusing on image classification, the system works with only a reduced portion of the input image data. Using transfer learning, it employs a pre-trained feature extractor, whose encoded representations are then fed into selected subnetworks of the final classification module, following the Independent Subnetwork Training (IST) algorithm. In this way, the input and subsequent feedforward layers are trained via sparse “actions”: input and intermediate features are subsampled before being propagated through the forward layers. We conduct experiments on several benchmark datasets, including CIFAR-10, NWPU-RESISC45, and the Aerial Image dataset. The results consistently show appealing accuracy despite sparsity: empirically, there are cases where fixed masks outperform random masks, and the model achieves comparable or even superior accuracy with only a fraction (50% or less) of the original image, which makes the approach particularly relevant in bandwidth-constrained scenarios. These findings further highlight the robustness of the features learned by the ViT and point toward parsimonious image data representations with sparse models in distributed learning.
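
To make the two sparsity mechanisms concrete, below is a minimal, self-contained sketch of the idea in PyTorch. It is not the authors' implementation: the frozen linear `encoder` stands in for the pre-trained ViT feature extractor, the sizes are toy values, and the IST step is reduced to sampling one random subnetwork of the classification head per update; names such as `patch_mask` and `train_subnet_step` are illustrative only.

```python
# Hypothetical sketch (not the paper's code): sparse patch masking on the
# input plus IST-style subnetwork updates of the classification head that
# sits on top of a frozen, pre-trained feature extractor.
import torch
import torch.nn as nn

torch.manual_seed(0)

# --- Sparse input: keep only a fraction of the image patches ---------------
def patch_mask(num_patches: int, keep_ratio: float, fixed: bool) -> torch.Tensor:
    """Boolean mask over patches; fixed=True reuses the same deterministic mask."""
    k = max(1, int(keep_ratio * num_patches))
    idx = torch.arange(k) if fixed else torch.randperm(num_patches)[:k]
    mask = torch.zeros(num_patches, dtype=torch.bool)
    mask[idx] = True
    return mask

# --- Frozen stand-in for the pre-trained ViT encoder -----------------------
num_patches, patch_dim, feat_dim = 196, 768, 192   # toy sizes
encoder = nn.Linear(patch_dim, feat_dim)           # placeholder for the ViT
for p in encoder.parameters():
    p.requires_grad_(False)

# --- Classification head, trained one subnetwork at a time (IST-style) -----
hidden, num_classes = 256, 10
fc1 = nn.Linear(feat_dim, hidden)
fc2 = nn.Linear(hidden, num_classes)

def train_subnet_step(x_patches, y, keep_ratio=0.5, fixed_mask=False,
                      subnet_frac=0.5, lr=1e-2):
    # 1) Sparse input: drop patches, encode the survivors, mean-pool them.
    mask = patch_mask(num_patches, keep_ratio, fixed_mask)
    feats = encoder(x_patches[:, mask, :]).mean(dim=1)

    # 2) Sparse model: sample a random subset of hidden units and run the
    #    forward pass only through that subnetwork.
    sub = torch.randperm(hidden)[: int(subnet_frac * hidden)]
    h = torch.relu(feats @ fc1.weight[sub].T + fc1.bias[sub])
    logits = h @ fc2.weight[:, sub].T + fc2.bias

    loss = nn.functional.cross_entropy(logits, y)
    loss.backward()                                # grads touch only the subnet
    with torch.no_grad():                          # plain SGD on those weights
        for p in (fc1.weight, fc1.bias, fc2.weight, fc2.bias):
            if p.grad is not None:
                p -= lr * p.grad
                p.grad = None
    return loss.item()

# Toy usage: a batch of 8 "images" already flattened into patches.
x = torch.randn(8, num_patches, patch_dim)
y = torch.randint(0, num_classes, (8,))
print(train_subnet_step(x, y, keep_ratio=0.5, fixed_mask=True))
```

In the full system, each worker would receive its own subnetwork and masked inputs, train locally, and periodically synchronize with a server; that distributed loop, along with the fixed-mask choices studied in the paper, is what the single-step sketch above omits.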

Cite this Paper


BibTeX
@InProceedings{pmlr-v234-kariotakis24a,
  title     = {Leveraging Sparse Input and Sparse Models: Efficient Distributed Learning in Resource-Constrained Environments},
  author    = {Kariotakis, Emmanouil and Tsagkatakis, Grigorios and Tsakalides, Panagiotis and Kyrillidis, Anastasios},
  booktitle = {Conference on Parsimony and Learning},
  pages     = {554--569},
  year      = {2024},
  editor    = {Chi, Yuejie and Dziugaite, Gintare Karolina and Qu, Qing and Wang, Atlas Wang and Zhu, Zhihui},
  volume    = {234},
  series    = {Proceedings of Machine Learning Research},
  month     = {03--06 Jan},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v234/kariotakis24a/kariotakis24a.pdf},
  url       = {https://proceedings.mlr.press/v234/kariotakis24a.html},
  abstract  = {Optimizing for reduced computational and bandwidth resources enables model training in less-than-ideal environments and paves the way for practical and accessible AI solutions. This work is about the study and design of a system that exploits sparsity in the input layer and intermediate layers of a neural network. Further, the system gets trained and operates in a distributed manner. Focusing on image classification tasks, our system efficiently utilizes reduced portions of the input image data. By exploiting transfer learning techniques, it employs a pre-trained feature extractor, with the encoded representations being subsequently introduced into selected subnets of the system’s final classification module, adopting the Independent Subnetwork Training (IST) algorithm. This way, the input and subsequent feedforward layers are trained via sparse “actions”, where input and intermediate features are subsampled and propagated in the forward layers. We conduct experiments on several benchmark datasets, including CIFAR-$10$, NWPU-RESISC$45$, and the Aerial Image dataset. The results consistently showcase appealing accuracy despite sparsity: it is surprising that, empirically, there are cases where fixed masks could potentially outperform random masks and that the model achieves comparable or even superior accuracy with only a fraction ($50\%$ or less) of the original image, making it particularly relevant in bandwidth-constrained scenarios. This further highlights the robustness of learned features extracted by ViT, offering the potential for parsimonious image data representation with sparse models in distributed learning.}
}
Endnote
%0 Conference Paper
%T Leveraging Sparse Input and Sparse Models: Efficient Distributed Learning in Resource-Constrained Environments
%A Emmanouil Kariotakis
%A Grigorios Tsagkatakis
%A Panagiotis Tsakalides
%A Anastasios Kyrillidis
%B Conference on Parsimony and Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Yuejie Chi
%E Gintare Karolina Dziugaite
%E Qing Qu
%E Atlas Wang Wang
%E Zhihui Zhu
%F pmlr-v234-kariotakis24a
%I PMLR
%P 554--569
%U https://proceedings.mlr.press/v234/kariotakis24a.html
%V 234
%X Optimizing for reduced computational and bandwidth resources enables model training in less-than-ideal environments and paves the way for practical and accessible AI solutions. This work is about the study and design of a system that exploits sparsity in the input layer and intermediate layers of a neural network. Further, the system gets trained and operates in a distributed manner. Focusing on image classification tasks, our system efficiently utilizes reduced portions of the input image data. By exploiting transfer learning techniques, it employs a pre-trained feature extractor, with the encoded representations being subsequently introduced into selected subnets of the system’s final classification module, adopting the Independent Subnetwork Training (IST) algorithm. This way, the input and subsequent feedforward layers are trained via sparse “actions”, where input and intermediate features are subsampled and propagated in the forward layers. We conduct experiments on several benchmark datasets, including CIFAR-$10$, NWPU-RESISC$45$, and the Aerial Image dataset. The results consistently showcase appealing accuracy despite sparsity: it is surprising that, empirically, there are cases where fixed masks could potentially outperform random masks and that the model achieves comparable or even superior accuracy with only a fraction ($50%$ or less) of the original image, making it particularly relevant in bandwidth-constrained scenarios. This further highlights the robustness of learned features extracted by ViT, offering the potential for parsimonious image data representation with sparse models in distributed learning.
APA
Kariotakis, E., Tsagkatakis, G., Tsakalides, P., & Kyrillidis, A. (2024). Leveraging Sparse Input and Sparse Models: Efficient Distributed Learning in Resource-Constrained Environments. Conference on Parsimony and Learning, in Proceedings of Machine Learning Research 234:554-569. Available from https://proceedings.mlr.press/v234/kariotakis24a.html.