An Action Recognition Approach with Context and Multiscale Motion Awareness

  • Danilo Barros Cardoso UFMG
  • Luiza C. B. Campos UFMG
  • Erickson R. Nascimento UFMG

Abstract


Despite the substantial progress that computer vision approaches have made on tasks such as image classification, object detection, and pose estimation, activity recognition remains one of the key challenges in computer vision and pattern recognition. This paper proposes a new learning framework based on multiscale spatiotemporal graph convolution layers and a transformer architecture. Although several approaches achieve high accuracy on more traditional datasets such as NTU, their performance drops significantly on datasets with a high level of ambiguity among activities and an unbalanced number of samples per class. We evaluated our architecture on the challenging BABEL dataset, where it achieved state-of-the-art action classification accuracy (65.4%) when considering both ambiguity and class imbalance. The source code and trained models are publicly available at https://github.com/verlab/AnActionRecognitionApproach_SIBGRAPI_2022.
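To make the phrase "multiscale graph convolution" concrete, the following is a minimal NumPy sketch of one common formulation, in which per-frame joint features are aggregated over k-hop neighborhoods of the skeleton graph and the scales are summed. It is an illustration under stated assumptions, not the authors' implementation: the function names, the k-hop definition of scale, the symmetric normalization, the ReLU, and the toy 5-joint chain are all hypothetical choices, and the temporal and transformer components named in the abstract are omitted.

```python
import numpy as np


def k_hop_adjacency(adj, k):
    """Binary matrix linking each joint to every joint within k hops (self included)."""
    n = adj.shape[0]
    reach = np.linalg.matrix_power(adj + np.eye(n), k)
    return (reach > 0).astype(float)


def normalize(adj):
    """Symmetric normalization D^{-1/2} A D^{-1/2}; self-loops keep degrees >= 1."""
    d_inv_sqrt = 1.0 / np.sqrt(adj.sum(axis=1))
    return adj * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def multiscale_graph_conv(x, adj, weights):
    """Sum of graph convolutions over scales k = 1..K (assumed formulation).

    x:       (T, V, C_in) joint features for T frames and V joints
    adj:     (V, V) binary skeleton adjacency without self-loops
    weights: list of K matrices of shape (C_in, C_out), one per scale
    """
    out = 0.0
    for k, w in enumerate(weights, start=1):
        a_k = normalize(k_hop_adjacency(adj, k))
        # Aggregate neighbor features per frame, then project the channels.
        out = out + np.einsum("uv,tvc,cd->tud", a_k, x, w)
    return np.maximum(out, 0.0)  # ReLU


# Toy example: a 5-joint chain (hypothetical skeleton), 4 frames, 3D coordinates.
rng = np.random.default_rng(0)
chain = np.zeros((5, 5))
for i in range(4):
    chain[i, i + 1] = chain[i + 1, i] = 1.0
x = rng.normal(size=(4, 5, 3))
weights = [rng.normal(size=(3, 8)) for _ in range(3)]  # K = 3 scales
features = multiscale_graph_conv(x, chain, weights)
```

In this sketch, larger values of k let a joint such as a hand draw on features from distant joints such as the hip, which is one way that motion at several spatial scales can be captured within a single layer.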

Keywords: Measurement, Computer vision, Source coding, Semantics, Pose estimation, Computer architecture, Object detection
Published
24/10/2022
CARDOSO, Danilo Barros; CAMPOS, Luiza C. B.; NASCIMENTO, Erickson R. An Action Recognition Approach with Context and Multiscale Motion Awareness. In: CONFERENCE ON GRAPHICS, PATTERNS AND IMAGES (SIBGRAPI), 35., 2022, Natal/RN. Proceedings [...]. Porto Alegre: Sociedade Brasileira de Computação, 2022.