Iterative Deep Model Compression and Acceleration in the Frequency Domain

Yao Zeng, Xusheng Liu, Lintan Sun, Wenzhong Li, Yuchu Fang, Sanglu Lu
Proceedings of The 13th Asian Conference on Machine Learning, PMLR 157:331-346, 2021.

Abstract

Deep Convolutional Neural Networks (CNNs) are successfully applied in many complex tasks, but their huge storage and computational costs hinder their deployment on edge devices. CNN model compression techniques have been widely studied in the past five years, most of which operate in the spatial domain. Inspired by the sparsity and low-rank properties of weight matrices in the frequency domain, we propose a novel frequency pruning framework for model compression and acceleration that maintains high performance. We first apply the Discrete Cosine Transform (DCT) to convolutional kernels and train them in the frequency domain to obtain sparse representations. We then propose an iterative model compression method that decomposes the frequency matrices with a sample-based low-rank approximation algorithm, then fine-tunes and recomposes the low-rank matrices gradually until a predefined compression ratio is reached. We further demonstrate that model inference can be conducted directly on the decomposed frequency matrices, significantly reducing both model parameters and inference cost. Extensive experiments with well-known CNN models on three open datasets show that the proposed method outperforms state-of-the-art approaches in reducing both the number of parameters and floating-point operations (FLOPs) without sacrificing much model accuracy.
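The abstract outlines two core operations: transforming convolutional kernels into the frequency domain with the DCT, and replacing the resulting frequency matrices with low-rank factors. Below is a minimal NumPy/SciPy sketch of these ideas, not the authors' implementation: the layer shape, the hard threshold, the target rank, and the use of plain truncated SVD in place of the paper's sample-based low-rank approximation algorithm are all illustrative assumptions.

```python
import numpy as np
from scipy.fftpack import dct, idct

def dct2(x):
    """Orthonormal 2D DCT-II applied to the last two axes."""
    return dct(dct(x, axis=-1, norm='ortho'), axis=-2, norm='ortho')

def idct2(x):
    """Inverse 2D DCT applied to the last two axes."""
    return idct(idct(x, axis=-1, norm='ortho'), axis=-2, norm='ortho')

# Toy convolutional layer: 64 output channels, 32 input channels, 3x3 kernels.
rng = np.random.default_rng(0)
W = rng.normal(size=(64, 32, 3, 3)).astype(np.float32)

# Step 1: transform each kernel to the frequency domain; small coefficients
# can then be pruned to obtain a sparse representation.
F = dct2(W)
F[np.abs(F) < 0.05] = 0.0  # illustrative hard threshold, not the paper's rule

# Step 2: flatten the frequency tensor into a matrix and replace it with a
# rank-r factorization (truncated SVD stands in here for the sample-based
# low-rank approximation algorithm described in the paper).
M = F.reshape(64, -1)                  # (out_channels, in_channels * k * k)
U, s, Vt = np.linalg.svd(M, full_matrices=False)
r = 16                                 # illustrative target rank
A, B = U[:, :r] * s[:r], Vt[:r]        # M ~= A @ B, far fewer parameters

# Recompose for spatial-domain inference (the paper also shows how to run
# inference directly on the decomposed frequency matrices).
W_approx = idct2((A @ B).reshape(W.shape))
print("relative error:", np.linalg.norm(W - W_approx) / np.linalg.norm(W))
```

In the paper's iterative scheme, the decomposition, fine-tuning, and recomposition steps are repeated until the predefined compression ratio is met; this sketch shows only a single pass.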

Cite this Paper


BibTeX
@InProceedings{pmlr-v157-zeng21a,
  title     = {Iterative Deep Model Compression and Acceleration in the Frequency Domain},
  author    = {Zeng, Yao and Liu, Xusheng and Sun, Lintan and Li, Wenzhong and Fang, Yuchu and Lu, Sanglu},
  booktitle = {Proceedings of The 13th Asian Conference on Machine Learning},
  pages     = {331--346},
  year      = {2021},
  editor    = {Balasubramanian, Vineeth N. and Tsang, Ivor},
  volume    = {157},
  series    = {Proceedings of Machine Learning Research},
  month     = {17--19 Nov},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v157/zeng21a/zeng21a.pdf},
  url       = {https://proceedings.mlr.press/v157/zeng21a.html}
}
Endnote
%0 Conference Paper
%T Iterative Deep Model Compression and Acceleration in the Frequency Domain
%A Yao Zeng
%A Xusheng Liu
%A Lintan Sun
%A Wenzhong Li
%A Yuchu Fang
%A Sanglu Lu
%B Proceedings of The 13th Asian Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2021
%E Vineeth N. Balasubramanian
%E Ivor Tsang
%F pmlr-v157-zeng21a
%I PMLR
%P 331--346
%U https://proceedings.mlr.press/v157/zeng21a.html
%V 157
APA
Zeng, Y., Liu, X., Sun, L., Li, W., Fang, Y. & Lu, S. (2021). Iterative Deep Model Compression and Acceleration in the Frequency Domain. Proceedings of The 13th Asian Conference on Machine Learning, in Proceedings of Machine Learning Research 157:331-346. Available from https://proceedings.mlr.press/v157/zeng21a.html.