Research on gesture recognition based on YOLOv8
Proceedings of 2025 2nd International Conference on Machine Learning and Intelligent Computing, PMLR 278:189-196, 2025.
Abstract
Fast and accurate gesture recognition has long been an active research topic. However, existing gesture recognition algorithms still face two challenges. First, deep learning models for gesture recognition often have high computational complexity and large parameter counts, making them difficult to deploy on resource-limited embedded devices. Second, deep learning models for dynamic gesture recognition remain limited in their ability to extract positional and spatial features. To address these problems, this paper proposes a gesture recognition algorithm based on an attention mechanism. First, the lightweight You Only Look Once (YOLO) v8n object detection algorithm is selected to reduce the parameter count and computational cost. Second, a Multi-Head Self-Attention (MHSA) module is integrated into the YOLOv8n network to enhance feature extraction along the positional and spatial dimensions. Experimental results show that the proposed algorithm achieves 99.2% accuracy, 1.1% higher than the original algorithm, and reaches a detection speed of 233 FPS on an Nvidia RTX 3070 GPU.
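To illustrate the kind of operation an MHSA module applies to a convolutional feature map, the following is a minimal sketch in plain Python. It is not the paper's implementation: the function names (`mhsa`, `matmul`, `softmax`, `random_matrix`), the random projection weights standing in for learned parameters, and the omission of positional encodings, the output projection, and any residual connection are all assumptions made for brevity. It treats each of the H x W spatial positions as a token, so every position can attend to every other position.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(value - peak) for value in row]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rows, cols, rng):
    """Small random weights; a stand-in for learned projections (assumption)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def mhsa(feature_map, num_heads, rng):
    """Apply multi-head self-attention to a C x H x W feature map.

    Returns a new C x H x W nested list. Hypothetical helper for illustration.
    """
    channels = len(feature_map)
    height = len(feature_map[0])
    width = len(feature_map[0][0])
    if channels % num_heads != 0:
        raise ValueError("channels must be divisible by num_heads")
    head_dim = channels // num_heads

    # Flatten the spatial grid into N = H * W tokens of C channels each.
    tokens = [[feature_map[c][i][j] for c in range(channels)]
              for i in range(height) for j in range(width)]

    # Project tokens to queries, keys, and values.
    q = matmul(tokens, random_matrix(channels, channels, rng))
    k = matmul(tokens, random_matrix(channels, channels, rng))
    v = matmul(tokens, random_matrix(channels, channels, rng))

    out_tokens = [[] for _ in tokens]
    scale = math.sqrt(head_dim)
    for head in range(num_heads):
        lo, hi = head * head_dim, (head + 1) * head_dim
        q_h = [row[lo:hi] for row in q]
        k_h = [row[lo:hi] for row in k]
        v_h = [row[lo:hi] for row in v]

        # Scaled dot-product scores between every pair of spatial positions.
        k_t = [list(col) for col in zip(*k_h)]
        scores = matmul(q_h, k_t)
        attn = [softmax([s / scale for s in row]) for row in scores]

        # Weighted sum of values, then concatenate this head's channels.
        for t, row in enumerate(matmul(attn, v_h)):
            out_tokens[t].extend(row)

    # Reshape the N x C token matrix back to C x H x W.
    return [[[out_tokens[i * width + j][c] for j in range(width)]
             for i in range(height)] for c in range(channels)]
```

Calling `mhsa(feature_map, num_heads=2, rng=random.Random(0))` on a 4 x 2 x 3 nested list returns a feature map of the same shape. Because the attention matrix is N x N with N = H * W, the cost grows quadratically with spatial resolution, which is why such modules are typically placed on low-resolution feature maps; where the paper inserts MHSA within YOLOv8n is not stated in the abstract.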