Dynamic Forward and Backward Sparse Training (DFBST): Accelerated Deep Learning through Completely Sparse Training Schedule
Proceedings of The 14th Asian Conference on Machine
Learning, PMLR 189:848-863, 2023.
Abstract
Neural network sparsification has received considerable
attention in recent years. A number of dynamic
sparse training methods have been developed that
achieve significant sparsity levels during training
while maintaining performance comparable to their dense
counterparts. However, most of these methods update
all the model parameters using dense gradients; where
gradient sparsification is attempted, it relies either
on a non-dynamic (fixed) schedule or on a computationally
expensive dynamic pruning schedule. To alleviate
these drawbacks, we propose Dynamic Forward and
Backward Sparse Training (DFBST), an algorithm that
dynamically sparsifies both the forward and backward
passes using trainable masks, leading to a
completely sparse training schedule. In contrast to
existing sparse training methods, we propose to learn
the forward and backward masks separately. Our approach
achieves state-of-the-art performance in terms of both
accuracy and sparsity
compared to existing dynamic pruning algorithms on
benchmark datasets, namely MNIST, CIFAR-10 and
CIFAR-100.
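
As a rough illustration of the core idea, the PyTorch sketch below defines a linear layer whose weights are gated by a trainable forward mask and whose weight gradients are gated by a second, independently parameterized backward mask. Everything here is an illustrative assumption: the class names, the score parameterization, the fixed threshold, and the straight-through binarization are not taken from the paper, and the sketch does not reproduce DFBST's actual mask-learning rules.

# Illustrative sketch only: a trainable forward mask plus a gradient-gating
# backward mask on a single linear layer. Names and the straight-through
# binarization are assumptions; this is not the DFBST algorithm itself.
import torch
import torch.nn as nn
import torch.nn.functional as F


class BinarizeSTE(torch.autograd.Function):
    """Threshold real-valued scores to a {0,1} mask; straight-through gradient."""

    @staticmethod
    def forward(ctx, scores, threshold):
        return (scores > threshold).float()

    @staticmethod
    def backward(ctx, grad_output):
        # Pass the gradient straight through to the underlying scores.
        return grad_output, None


class GateGradient(torch.autograd.Function):
    """Identity in the forward pass; multiplies the gradient by a mask in backward."""

    @staticmethod
    def forward(ctx, weight, backward_mask):
        ctx.save_for_backward(backward_mask)
        return weight.view_as(weight)

    @staticmethod
    def backward(ctx, grad_output):
        (backward_mask,) = ctx.saved_tensors
        # The backward mask sparsifies the weight gradient. In this sketch the
        # mask itself receives no gradient here; the paper learns it separately.
        return grad_output * backward_mask, None


class DoublyMaskedLinear(nn.Module):
    """Linear layer with independent forward and backward sparsity masks."""

    def __init__(self, in_features, out_features, threshold=0.5):
        super().__init__()
        self.weight = nn.Parameter(0.01 * torch.randn(out_features, in_features))
        # Separate score tensors parameterize the two masks; roughly half the
        # entries start active, and training moves scores across the threshold.
        self.fwd_scores = nn.Parameter(torch.rand(out_features, in_features))
        self.bwd_scores = nn.Parameter(torch.rand(out_features, in_features))
        self.threshold = threshold

    def forward(self, x):
        fwd_mask = BinarizeSTE.apply(self.fwd_scores, self.threshold)
        bwd_mask = BinarizeSTE.apply(self.bwd_scores, self.threshold)
        # Forward pass uses masked weights; backward pass sees masked gradients.
        w = GateGradient.apply(self.weight, bwd_mask) * fwd_mask
        return F.linear(x, w)


if __name__ == "__main__":
    layer = DoublyMaskedLinear(8, 4)
    out = layer(torch.randn(2, 8))
    out.sum().backward()  # layer.weight.grad is gated by the backward mask

In a full DFBST-style setup the backward-mask scores would also be trained (the paper proposes learning the forward and backward masks separately); the sketch only shows how two independent masks can sparsify the forward computation and the weight gradients.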