Sparse Fast Fourier Transform on GPUs and Multi-core CPUs

  • Jiaxi Hu University of Minnesota
  • Zhaosen Wang University of Minnesota
  • Qiyuan Qiu University of Minnesota
  • Weijun Xiao Virginia Commonwealth University
  • David J. Lilja University of Minnesota

Resumo


Given an N-point sequence, finding its k largest components in the frequency domain is a problem of great interest. This problem, which is usually referred to as a sparse Fourier Transform, was recently brought back on stage by a newly proposed algorithm called the sFFT. In this paper, we present a parallel implementation of sFFT on both multi-core CPUs and GPUs using a human voice signal as a case study. Using this example, an estimate of k for the 3dB cutoff points was conducted through concrete experiments. In addition, three optimization strategies are presented in this paper. We demonstrate that the multi-core-based sFFT achieves speedups of up to three times a single-threaded sFFT while a GPU-based version achieves up to ten times speedup. For large scale cases, the GPU-based sFFT also shows its considerable advantages, which is about 40 times speedup compared to the latest out-of-card FFT implementations [2].
Palavras-chave: Graphics processing units, Instruction sets, Kernel, Estimation, Libraries, Frequency domain analysis, Human voice, Sparse FFT, GPUs, Multi-core CPUs, performance speedup
Publicado
24/10/2012
HU, Jiaxi; WANG, Zhaosen; QIU, Qiyuan; XIAO, Weijun; LILJA, David J.. Sparse Fast Fourier Transform on GPUs and Multi-core CPUs. In: INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE AND HIGH PERFORMANCE COMPUTING (SBAC-PAD), 24. , 2012, Nova Iorque/EUA. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2012 . p. 83-91.