AARM: Action Attention Recalibration Module for Action Recognition

Li Zhonghong; Yi Yang; She Ying; Song Jialun; Wu Yukun

Proceedings of The 12th Asian Conference on Machine Learning, PMLR 129:97-112, 2020.

Abstract

Most action recognition methods deploy networks pretrained on image datasets, and a common limitation is that these networks struggle to capture the salient features of a video clip because of their training strategies. To address this issue, we propose the Action Attention Recalibration Module (AARM), a lightweight but effective module that applies an attention mechanism to the feature maps of the network. The proposed module is composed of two novel components: 1) a convolutional attention submodule that obtains inter-channel attention maps and spatial-temporal attention maps during the convolutional stage, and 2) an activation attention submodule that highlights the significant activations in the fully connected stage. Based on ablation studies and extensive experiments, we demonstrate that AARM makes networks more sensitive to informative parts and improves their accuracy, achieving state-of-the-art performance on UCF101 and HMDB51.

Cite this Paper

BibTeX

@InProceedings{pmlr-v129-zhonghong20a,
  title = 	 {AARM: Action Attention Recalibration Module for Action Recognition},
  author =       {Zhonghong, Li and Yang, Yi and Ying, She and Jialun, Song and Yukun, Wu},
  booktitle = 	 {Proceedings of The 12th Asian Conference on Machine Learning},
  pages = 	 {97--112},
  year = 	 {2020},
  editor = 	 {Pan, Sinno Jialin and Sugiyama, Masashi},
  volume = 	 {129},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {18--20 Nov},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v129/zhonghong20a/zhonghong20a.pdf},
  url = 	 {https://proceedings.mlr.press/v129/zhonghong20a.html},
  abstract = 	 {Most of Action recognition methods deploy networks pretrained on image datasets, and a common limitation is that these networks hardly capture salient features of the video clip due to their training strategies. To address this issue, we propose Action Attention Recalibration Module (AARM), a lightweight but effective module which introduces the attention mechanism to process feature maps of the network. The proposed module is composed of two novel components: 1) convolutional attention submodule that obtains inter-channel attention maps and spatial-temporal attention maps during the convolutional stage, and 2) activation attention submodule that highlights the significant activations in the fully connected process. Based on ablation studies and extensive experiments, we demonstrate that AARM enables networks to be sensitive on informative parts and gain accuracy increasements, achieving the state-of-the-art performance on UCF101 and HMDB51.}
}

Endnote

%0 Conference Paper
%T AARM: Action Attention Recalibration Module for Action Recognition
%A Li Zhonghong
%A Yi Yang
%A She Ying
%A Song Jialun
%A Wu Yukun
%B Proceedings of The 12th Asian Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2020
%E Sinno Jialin Pan
%E Masashi Sugiyama
%F pmlr-v129-zhonghong20a
%I PMLR
%P 97--112
%U https://proceedings.mlr.press/v129/zhonghong20a.html
%V 129
%X Most of Action recognition methods deploy networks pretrained on image datasets, and a common limitation is that these networks hardly capture salient features of the video clip due to their training strategies. To address this issue, we propose Action Attention Recalibration Module (AARM), a lightweight but effective module which introduces the attention mechanism to process feature maps of the network. The proposed module is composed of two novel components: 1) convolutional attention submodule that obtains inter-channel attention maps and spatial-temporal attention maps during the convolutional stage, and 2) activation attention submodule that highlights the significant activations in the fully connected process. Based on ablation studies and extensive experiments, we demonstrate that AARM enables networks to be sensitive on informative parts and gain accuracy increasements, achieving the state-of-the-art performance on UCF101 and HMDB51.

APA

Zhonghong, L., Yang, Y., Ying, S., Jialun, S. & Yukun, W. (2020). AARM: Action Attention Recalibration Module for Action Recognition. Proceedings of The 12th Asian Conference on Machine Learning, in Proceedings of Machine Learning Research 129:97-112. Available from https://proceedings.mlr.press/v129/zhonghong20a.html.
