A lightweight 2D Pose Machine with attention enhancement

  • Luiz Schirmer PUC-Rio
  • Djalma Lúcio IMPA
  • Alberto Raposo PUC-Rio
  • Luiz Velho IMPA
  • Hélio Lopes PUC-Rio

Abstract


Pose estimation is a challenging task in computer vision with many applications, such as motion capture, medical analysis, human posture monitoring, and robotics. In other words, it is a key tool for enabling machines to understand human patterns in videos or images. Performing this task in real time while maintaining accuracy and precision is critical for many of these applications. Several papers propose real-time approaches to pose estimation based on deep neural networks. However, most of them either fall short in run-time performance or do not achieve the required precision. In this work, we propose a new model for real-time pose estimation that uses attention modules for convolutional neural networks (CNNs). We introduce a two-dimensional relative attention mechanism for feature extraction in pose machines, which leads to improvements in accuracy. We create a single-shot architecture in which the operations that infer keypoints and part affinity fields share information. In addition, at each stage we use tensor decompositions not only to reduce dimensionality but also to improve performance. This allows us to factorize each convolution and drastically reduces the number of parameters in our network. Our experiments show that, with this factorized approach, it is possible to achieve state-of-the-art run-time performance at the cost of a small reduction in accuracy.
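The parameter savings that come from factorizing a convolution can be illustrated with a simple weight count. The sketch below is not the authors' implementation: the abstract does not say which tensor decomposition is used, so it assumes a CP-style factorization (a 1x1 convolution, two one-dimensional depthwise convolutions, and a final 1x1 convolution). The function names, channel sizes, and rank are hypothetical and chosen only for illustration.

```python
def conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def cp_factorized_params(c_in, c_out, k, rank):
    """Weights after an assumed CP-style factorization of the same layer.

    The k x k convolution is replaced by four smaller ones:
    1x1 (c_in -> rank), k x 1 depthwise, 1 x k depthwise, 1x1 (rank -> c_out).
    """
    pointwise_in = c_in * rank    # 1x1 projection into the rank space
    vertical = k * rank           # k x 1 depthwise filter per rank channel
    horizontal = k * rank         # 1 x k depthwise filter per rank channel
    pointwise_out = rank * c_out  # 1x1 projection to the output channels
    return pointwise_in + vertical + horizontal + pointwise_out


# Hypothetical layer: 128 -> 128 channels, 3x3 kernel, rank 32.
full = conv_params(128, 128, 3)
small = cp_factorized_params(128, 128, 3, rank=32)
print(f"standard: {full} weights, factorized: {small} weights")
print(f"reduction factor: {full / small:.1f}x")
```

A lower rank gives a smaller layer but a coarser approximation of the original filter bank, which is consistent with the trade-off between run time and accuracy described in the abstract.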
Keywords: pose estimation, convolutional pose machines, attention layers
Published
7 November 2020
SCHIRMER, Luiz; LÚCIO, Djalma; RAPOSO, Alberto; VELHO, Luiz; LOPES, Hélio. A lightweight 2D Pose Machine with attention enhancement. In: CONFERENCE ON GRAPHICS, PATTERNS AND IMAGES (SIBGRAPI), 33., 2020, Online Event. Proceedings [...]. Porto Alegre: Sociedade Brasileira de Computação, 2020. p. 235-242.