Applying CUDA Architecture to Accelerate Full Search Block Matching Algorithm for High Performance Motion Estimation in Video Encoding

  • Eduarda Monteiro UFRGS
  • Bruno Vizzotto UFRGS
  • Cláudio Diniz UFRGS
  • Bruno Zatt UFRGS
  • Sergio Bampi UFRGS

Abstract


This work presents a parallel GPU-based solution for the Motion Estimation (ME) process in a video encoding system. We propose a way to partition the steps of the Full Search block matching algorithm on the CUDA architecture. We also compare the performance of this solution against a theoretical model and two other implementations: a sequential one and a parallel one using the OpenMP library. We obtained an O(n^2/log^2 n) speed-up, which fits the proposed theoretical model across different search areas. This represents a gain of up to 600x over the serial implementation and 66x over the parallel OpenMP implementation.
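
For readers unfamiliar with Full Search block matching, the following is a minimal sequential sketch in C of the kind of serial baseline the abstract refers to. It is not the authors' code or their CUDA partitioning. The matching criterion (Sum of Absolute Differences, SAD), the `MotionVector` type, and the function names `block_sad` and `full_search` are assumptions made for illustration; the abstract does not specify them.

```c
#include <limits.h>
#include <stdlib.h>

/* Best displacement found and its matching cost (hypothetical type). */
typedef struct {
    int dx;
    int dy;
    unsigned int sad;
} MotionVector;

/* Sum of Absolute Differences between the n x n block of `cur` at (cx, cy)
 * and the n x n block of `ref` at (rx, ry). Both frames are 8-bit luma
 * planes stored row by row with the same stride. SAD is assumed here as
 * the matching criterion. */
static unsigned int block_sad(const unsigned char *cur, const unsigned char *ref,
                              int stride, int cx, int cy, int rx, int ry, int n)
{
    unsigned int sad = 0;
    for (int y = 0; y < n; ++y) {
        for (int x = 0; x < n; ++x) {
            int a = cur[(cy + y) * stride + cx + x];
            int b = ref[(ry + y) * stride + rx + x];
            sad += (unsigned int)abs(a - b);
        }
    }
    return sad;
}

/* Full Search: evaluate every displacement (dx, dy) in [-range, range]^2
 * whose candidate block lies entirely inside the reference frame, and keep
 * the one with the lowest SAD. Ties keep the first candidate in scan order. */
MotionVector full_search(const unsigned char *cur, const unsigned char *ref,
                         int width, int height, int bx, int by, int n, int range)
{
    MotionVector best = {0, 0, UINT_MAX};
    for (int dy = -range; dy <= range; ++dy) {
        for (int dx = -range; dx <= range; ++dx) {
            int rx = bx + dx;
            int ry = by + dy;
            /* Skip candidates that fall outside the reference frame. */
            if (rx < 0 || ry < 0 || rx + n > width || ry + n > height)
                continue;
            unsigned int sad = block_sad(cur, ref, width, bx, by, rx, ry, n);
            if (sad < best.sad) {
                best.dx = dx;
                best.dy = dy;
                best.sad = sad;
            }
        }
    }
    return best;
}
```

Each block tests up to (2·range + 1)^2 candidates, and no candidate depends on the result of another. That independence is what makes the algorithm a natural fit for GPU parallelization, although how the work is actually divided among CUDA threads and blocks is the subject of the paper itself and is not reproduced here.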
Keywords: Graphics processing unit, Complexity theory, Motion estimation, Computer architecture, Algorithm design and analysis, Encoding, Accuracy, H.264/AVC, GPU, CUDA
Published
26/10/2011
MONTEIRO, Eduarda; VIZZOTTO, Bruno; DINIZ, Cláudio; ZATT, Bruno; BAMPI, Sergio. Applying CUDA Architecture to Accelerate Full Search Block Matching Algorithm for High Performance Motion Estimation in Video Encoding. In: INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE AND HIGH PERFORMANCE COMPUTING (SBAC-PAD), 23., 2011, Vitória/ES. Proceedings [...]. Porto Alegre: Sociedade Brasileira de Computação, 2011. p. 128-135.