HRBP: Hardware-friendly Regrouping towards Block-based Pruning for Sparse CNN Training

Haoyu Ma, Chengming Zhang, Lizhi Xiang, Xiaolong Ma, Geng Yuan, Wenkai Zhang, Shiwei Liu, Tianlong Chen, Dingwen Tao, Yanzhi Wang, Zhangyang Wang, Xiaohui Xie
Conference on Parsimony and Learning, PMLR 234:282-301, 2024.

Abstract

Pruning at initialization and training a sparse network from scratch (sparse training) have become increasingly popular. However, most sparse training literature addresses only unstructured sparsity, which in practice brings little benefit to training acceleration on GPUs due to the irregularity of non-zero weights. In this paper, we work on sparse training with fine-grained structured sparsity by extracting a few dense blocks from unstructured sparse weights. For convolutional neural networks (CNNs), however, the extracted dense blocks are broken in backpropagation due to the shape transformation of convolution filters implemented by GEMM. Thus, previous block-wise pruning methods can only accelerate the forward pass of sparse CNN training. To address this, we propose Hardware-friendly Regrouping towards Block-based Pruning (HRBP), where the grouping is conducted on the kernel-wise mask. With HRBP, the extracted dense blocks are preserved in backpropagation. Extensive experiments on CIFAR-10, CIFAR-100, and ImageNet demonstrate that HRBP nearly matches the accuracy of unstructured sparse training methods while achieving substantial acceleration on hardware. Code is available at: https://github.com/HowieMa/HRBP-pruning.
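To make the kernel-wise grouping idea concrete, below is a minimal NumPy sketch, not the authors' implementation; the toy dimensions, block placement, and GEMM layout conventions are assumptions. It illustrates why a mask grouped at kernel granularity stays block-dense in both the forward GEMM layout of the weights and the channel-swapped layout used when computing input gradients, whereas element-wise block masks generally do not survive that reshaping.

```python
import numpy as np

# Toy dimensions (assumptions for illustration only).
C_out, C_in, k = 4, 4, 3

# Kernel-wise mask: each entry keeps or prunes an entire k x k kernel.
# Suppose regrouping has collected the kept kernels into a dense
# 2 x 2 block (output channels 0-1, input channels 2-3).
kernel_mask = np.zeros((C_out, C_in), dtype=bool)
kernel_mask[0:2, 2:4] = True

# Forward GEMM layout: W is flattened to (C_out, C_in * k * k).
# Repeating each kernel-mask entry k*k times along the columns gives the
# element-wise mask; the kept entries form one dense row/column block.
fwd_mask = np.repeat(kernel_mask, k * k, axis=1)    # (C_out, C_in*k*k)

# Backward (grad-input) layout: input and output channels are swapped,
# so the weight is laid out as (C_in, C_out * k * k). Because pruning
# acts on whole kernels, transposing the kernel-wise mask and repeating
# it reproduces a dense block in this layout as well.
bwd_mask = np.repeat(kernel_mask.T, k * k, axis=1)  # (C_in, C_out*k*k)

print(fwd_mask[0:2, 2 * k * k:4 * k * k].all())  # True: dense block in forward
print(bwd_mask[2:4, 0:2 * k * k].all())          # True: dense block in backward
```

Since the same kernel-level block structure appears in both layouts, a block-sparse GEMM kernel can, in principle, skip the pruned regions in the forward and backward passes alike, which is the hardware benefit the abstract refers to.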

Cite this Paper


BibTeX
@InProceedings{pmlr-v234-ma24a, title = {HRBP: Hardware-friendly Regrouping towards Block-based Pruning for Sparse CNN Training}, author = {Ma, Haoyu and Zhang, Chengming and xiang, lizhi and Ma, Xiaolong and Yuan, Geng and Zhang, Wenkai and Liu, Shiwei and Chen, Tianlong and Tao, Dingwen and Wang, Yanzhi and Wang, Zhangyang and Xie, Xiaohui}, booktitle = {Conference on Parsimony and Learning}, pages = {282--301}, year = {2024}, editor = {Chi, Yuejie and Dziugaite, Gintare Karolina and Qu, Qing and Wang, Atlas Wang and Zhu, Zhihui}, volume = {234}, series = {Proceedings of Machine Learning Research}, month = {03--06 Jan}, publisher = {PMLR}, pdf = {https://proceedings.mlr.press/v234/ma24a/ma24a.pdf}, url = {https://proceedings.mlr.press/v234/ma24a.html}, abstract = {Pruning at initialization and training a sparse network from scratch (sparse training) become increasingly popular. However, most sparse training literature addresses only the unstructured sparsity, which in practice brings little benefit to the training acceleration on GPU due to the irregularity of non-zero weights. In this paper, we work on sparse training with fine-grained structured sparsity, by extracting a few dense blocks from unstructured sparse weights. For Convolutional Neural networks (CNN), however, the extracted dense blocks will be broken in backpropagation due to the shape transformation of convolution filters implemented by GEMM. Thus, previous block-wise pruning methods can only be used to accelerate the forward pass of sparse CNN training. To this end, we propose Hardware-friendly Regrouping towards Block-based Pruning (HRBP), where the grouping is conducted on the kernel-wise mask. With HRBP, extracted dense blocks are preserved in backpropagation. Extensive experiments on CIFAR-10, CIFAR-100, and ImageNet demonstrate that HRBP can almost match the accuracy of unstructured sparse training methods while achieving a huge acceleration on hardware. Code is available at: https://github.com/HowieMa/HRBP-pruning.} }
Endnote
%0 Conference Paper %T HRBP: Hardware-friendly Regrouping towards Block-based Pruning for Sparse CNN Training %A Haoyu Ma %A Chengming Zhang %A lizhi xiang %A Xiaolong Ma %A Geng Yuan %A Wenkai Zhang %A Shiwei Liu %A Tianlong Chen %A Dingwen Tao %A Yanzhi Wang %A Zhangyang Wang %A Xiaohui Xie %B Conference on Parsimony and Learning %C Proceedings of Machine Learning Research %D 2024 %E Yuejie Chi %E Gintare Karolina Dziugaite %E Qing Qu %E Atlas Wang Wang %E Zhihui Zhu %F pmlr-v234-ma24a %I PMLR %P 282--301 %U https://proceedings.mlr.press/v234/ma24a.html %V 234 %X Pruning at initialization and training a sparse network from scratch (sparse training) become increasingly popular. However, most sparse training literature addresses only the unstructured sparsity, which in practice brings little benefit to the training acceleration on GPU due to the irregularity of non-zero weights. In this paper, we work on sparse training with fine-grained structured sparsity, by extracting a few dense blocks from unstructured sparse weights. For Convolutional Neural networks (CNN), however, the extracted dense blocks will be broken in backpropagation due to the shape transformation of convolution filters implemented by GEMM. Thus, previous block-wise pruning methods can only be used to accelerate the forward pass of sparse CNN training. To this end, we propose Hardware-friendly Regrouping towards Block-based Pruning (HRBP), where the grouping is conducted on the kernel-wise mask. With HRBP, extracted dense blocks are preserved in backpropagation. Extensive experiments on CIFAR-10, CIFAR-100, and ImageNet demonstrate that HRBP can almost match the accuracy of unstructured sparse training methods while achieving a huge acceleration on hardware. Code is available at: https://github.com/HowieMa/HRBP-pruning.
APA
Ma, H., Zhang, C., Xiang, L., Ma, X., Yuan, G., Zhang, W., Liu, S., Chen, T., Tao, D., Wang, Y., Wang, Z. & Xie, X. (2024). HRBP: Hardware-friendly Regrouping towards Block-based Pruning for Sparse CNN Training. Conference on Parsimony and Learning, in Proceedings of Machine Learning Research 234:282-301. Available from https://proceedings.mlr.press/v234/ma24a.html.