Fusing Recalibrated Features and Depthwise Separable Convolution for the Mangrove Bird Sound Classification
Proceedings of The Eleventh Asian Conference on Machine Learning, PMLR 101:924-939, 2019.
The bird community in the mangrove areas is an important component of the mangrove wetlands ecosystem and an indicator species for the assessment of the environmental health status of mangrove wetlands. The classification of bird species by the sound of bird in the mangrove areas has the advantages of less interference to the environment and wide monitoring range. In this paper, we propose a novel method that combines the feature recalibration mechanism with depthwise separable convolution for the mangrove bird sound classification. In the proposed method, we introduce Xception network in which depthwise separable convolution with lower parameter number and computational cost than traditional convolution can be stacked in a residual manner, as the baseline network. And we fuse the feature recalibration mechanism into the depthwise separable convolution for actively learning the weights of the feature channels in the network layer, so that we can enhance the important features in bird sound signals to improve the performance of the classification. In the proposed method, firstly we extract three-channel log-mel features of the bird sound signals and we introduce the mixup method to augment the extracted features. Secondly, we construct the recalibrated feature maps including the different scales of information to get the classification results. To verify the effectiveness of the proposed method, we build a dataset with 9282 samples including 25 kinds of the mangrove birds such as Egretta alba, Parus major, Charadrius dubius, etc. habiting in the mangroves of Fangcheng Port of China, and execute the experiments on the built dataset. Furthermore, we also validate the adaptability of our proposed method on the dataset of TAU Urban Acoustic Scenes 2019, and achieve a better result.