Ske2Grid: Skeleton-to-Grid Representation Learning for Action Recognition

Dongqi Cai; Yangyuxuan Kang; Anbang Yao; Yurong Chen

Proceedings of the 40th International Conference on Machine Learning, PMLR 202:3431-3441, 2023.

Abstract

This paper presents Ske2Grid, a new representation learning framework for improved skeleton-based action recognition. In Ske2Grid, we define a regular convolution operation upon a novel grid representation of human skeleton, which is a compact image-like grid patch constructed and learned through three novel designs. Specifically, we propose a graph-node index transform (GIT) to construct a regular grid patch through assigning the nodes in the skeleton graph one by one to the desired grid cells. To ensure that GIT is a bijection and enrich the expressiveness of the grid representation, an up-sampling transform (UPT) is learned to interpolate the skeleton graph nodes for filling the grid patch to the full. To resolve the problem when the one-step UPT is aggressive and further exploit the representation capability of the grid patch with increasing spatial size, a progressive learning strategy (PLS) is proposed which decouples the UPT into multiple steps and aligns them to multiple paired GITs through a compact cascaded design learned progressively. We construct networks upon prevailing graph convolution networks and conduct experiments on six mainstream skeleton-based action recognition datasets. Experiments show that our Ske2Grid significantly outperforms existing GCN-based solutions under different benchmark settings, without bells and whistles. Code and models are available at https://github.com/OSVAI/Ske2Grid.
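The two transforms named in the abstract can be pictured with a small toy sketch. This is not the authors' implementation: in the paper both GIT and UPT are learned, and the abstract does not say how they are parameterised. Here a random row-normalised weight matrix stands in for the UPT and a random permutation stands in for the GIT. The function names, the 17-joint skeleton, and the 5x5 grid are all illustrative assumptions.

```python
import random


def upsample(nodes, weights):
    """UPT stand-in: each output node is a weighted sum of the input nodes."""
    channels = len(nodes[0])
    return [
        [sum(w * node[c] for w, node in zip(row, nodes)) for c in range(channels)]
        for row in weights
    ]


def to_grid(nodes, index, height, width):
    """GIT stand-in: index[k] is the flat grid cell that receives node k."""
    cells = height * width
    if len(nodes) != cells or sorted(index) != list(range(cells)):
        raise ValueError("index must be a permutation covering every grid cell")
    grid = [[None] * width for _ in range(height)]
    for k, cell in enumerate(index):
        grid[cell // width][cell % width] = nodes[k]
    return grid


def ske2grid_sketch(nodes, height, width, seed=0):
    """Interpolate the nodes up to height*width, then place them one per cell."""
    rng = random.Random(seed)
    cells = height * width
    # Assumption: random convex weights replace the learned up-sampling matrix.
    weights = []
    for _ in range(cells):
        row = [rng.random() for _ in nodes]
        total = sum(row)
        weights.append([w / total for w in row])
    # Assumption: a random permutation replaces the learned index transform.
    index = list(range(cells))
    rng.shuffle(index)
    return to_grid(upsample(nodes, weights), index, height, width)


if __name__ == "__main__":
    # 17 hypothetical joints, each with (x, y) coordinates.
    skeleton = [[float(i), float(2 * i)] for i in range(17)]
    patch = ske2grid_sketch(skeleton, 5, 5)
    print(len(patch), len(patch[0]), len(patch[0][0]))
```

Because every grid cell receives exactly one up-sampled node, the resulting patch has no empty cells, and an ordinary 2D convolution could be applied to it as to a small image.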

Cite this Paper

BibTeX


@InProceedings{pmlr-v202-cai23c,
  title = 	 {{S}ke2{G}rid: Skeleton-to-Grid Representation Learning for Action Recognition},
  author =       {Cai, Dongqi and Kang, Yangyuxuan and Yao, Anbang and Chen, Yurong},
  booktitle = 	 {Proceedings of the 40th International Conference on Machine Learning},
  pages = 	 {3431--3441},
  year = 	 {2023},
  editor = 	 {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan},
  volume = 	 {202},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {23--29 Jul},
  publisher =    {PMLR},
  pdf = 	 {https://proceedings.mlr.press/v202/cai23c/cai23c.pdf},
  url = 	 {https://proceedings.mlr.press/v202/cai23c.html},
  abstract = 	 {This paper presents Ske2Grid, a new representation learning framework for improved skeleton-based action recognition. In Ske2Grid, we define a regular convolution operation upon a novel grid representation of human skeleton, which is a compact image-like grid patch constructed and learned through three novel designs. Specifically, we propose a graph-node index transform (GIT) to construct a regular grid patch through assigning the nodes in the skeleton graph one by one to the desired grid cells. To ensure that GIT is a bijection and enrich the expressiveness of the grid representation, an up-sampling transform (UPT) is learned to interpolate the skeleton graph nodes for filling the grid patch to the full. To resolve the problem when the one-step UPT is aggressive and further exploit the representation capability of the grid patch with increasing spatial size, a progressive learning strategy (PLS) is proposed which decouples the UPT into multiple steps and aligns them to multiple paired GITs through a compact cascaded design learned progressively. We construct networks upon prevailing graph convolution networks and conduct experiments on six mainstream skeleton-based action recognition datasets. Experiments show that our Ske2Grid significantly outperforms existing GCN-based solutions under different benchmark settings, without bells and whistles. Code and models are available at https://github.com/OSVAI/Ske2Grid.}
}

Endnote

%0 Conference Paper
%T Ske2Grid: Skeleton-to-Grid Representation Learning for Action Recognition
%A Dongqi Cai
%A Yangyuxuan Kang
%A Anbang Yao
%A Yurong Chen
%B Proceedings of the 40th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Andreas Krause
%E Emma Brunskill
%E Kyunghyun Cho
%E Barbara Engelhardt
%E Sivan Sabato
%E Jonathan Scarlett
%F pmlr-v202-cai23c
%I PMLR
%P 3431--3441
%U https://proceedings.mlr.press/v202/cai23c.html
%V 202
%X This paper presents Ske2Grid, a new representation learning framework for improved skeleton-based action recognition. In Ske2Grid, we define a regular convolution operation upon a novel grid representation of human skeleton, which is a compact image-like grid patch constructed and learned through three novel designs. Specifically, we propose a graph-node index transform (GIT) to construct a regular grid patch through assigning the nodes in the skeleton graph one by one to the desired grid cells. To ensure that GIT is a bijection and enrich the expressiveness of the grid representation, an up-sampling transform (UPT) is learned to interpolate the skeleton graph nodes for filling the grid patch to the full. To resolve the problem when the one-step UPT is aggressive and further exploit the representation capability of the grid patch with increasing spatial size, a progressive learning strategy (PLS) is proposed which decouples the UPT into multiple steps and aligns them to multiple paired GITs through a compact cascaded design learned progressively. We construct networks upon prevailing graph convolution networks and conduct experiments on six mainstream skeleton-based action recognition datasets. Experiments show that our Ske2Grid significantly outperforms existing GCN-based solutions under different benchmark settings, without bells and whistles. Code and models are available at https://github.com/OSVAI/Ske2Grid.

APA


Cai, D., Kang, Y., Yao, A. & Chen, Y. (2023). Ske2Grid: Skeleton-to-Grid Representation Learning for Action Recognition. Proceedings of the 40th International Conference on Machine Learning, in Proceedings of Machine Learning Research 202:3431-3441. Available from https://proceedings.mlr.press/v202/cai23c.html.
