Fast parallel FFT on a reconfigurable computation platform
Abstract
We present an implementation of a very fast parallel complex FFT on M2, the second generation of the MorphoSys reconfigurable computation platform, which targets streamed applications such as multimedia and DSP. The proposed mapping comprises fast presorting, cascaded radix-2 stages, and post-reordering. Data and twiddle factors each have a 16-bit real part and a 16-bit imaginary part in two's complement format, and scaling is performed to avoid overflow. The mapping is tested on our cycle-accurate simulator, "mulate", and it outperforms other architectures such as Imagine and VIRAM. Moreover, the performance scales with FFT size. Since M2 has no functionality specifically tailored to FFT, the results demonstrate the capability of the MorphoSys architecture to extract parallelism from streamed applications. Further rationale is given based on the concepts of scalar operand networks and memory hierarchy.
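To make the pipeline in the abstract concrete, the following is a minimal sequential sketch of a radix-2 FFT on 16-bit-style fixed-point complex data: a bit-reversal presort followed by cascaded butterfly stages. It is not the M2 mapping itself. The Q15 twiddle format, the halving of every butterfly output as the scaling scheme, and the names `bit_reverse_permute`, `twiddle`, and `fft_q15` are all assumptions made for illustration; the post-reordering step and the parallel distribution across the array are not modelled.

```python
import math

Q = 15  # assumed fractional bits for twiddle factors (Q15); not stated in the abstract


def bit_reverse_permute(x):
    """Presort: move element i to the index whose binary digits are i's reversed."""
    n = len(x)
    bits = n.bit_length() - 1
    out = [None] * n
    for i in range(n):
        r = int(format(i, "0{}b".format(bits))[::-1], 2) if bits else 0
        out[r] = x[i]
    return out


def twiddle(k, m):
    """Q15 approximation of exp(-2*pi*j*k/m) as an integer (re, im) pair."""
    ang = -2.0 * math.pi * k / m
    return (round(math.cos(ang) * 32767), round(math.sin(ang) * 32767))


def fft_q15(x):
    """Radix-2 decimation-in-time FFT on integer (re, im) pairs.

    Every stage halves its outputs (an assumed scaling scheme), so the
    result approximates FFT(x) / N. Python integers do not wrap, so
    16-bit overflow or saturation behaviour is not modelled here.
    """
    n = len(x)
    assert n > 0 and n & (n - 1) == 0, "length must be a power of two"
    data = bit_reverse_permute(x)
    m = 2
    while m <= n:  # one pass per radix-2 stage
        half = m // 2
        for start in range(0, n, m):
            for k in range(half):
                wr, wi = twiddle(k, m)
                ar, ai = data[start + k]
                br, bi = data[start + k + half]
                # complex multiply b * w, rescaled from Q15
                tr = (br * wr - bi * wi) >> Q
                ti = (br * wi + bi * wr) >> Q
                # butterfly with a 1-bit right shift as per-stage scaling
                data[start + k] = ((ar + tr) >> 1, (ai + ti) >> 1)
                data[start + k + half] = ((ar - tr) >> 1, (ai - ti) >> 1)
        m *= 2
    return data


impulse = [(16384, 0)] + [(0, 0)] * 7
spectrum = fft_q15(impulse)
```

An impulse is a convenient check because its transform is flat: with eight points and three halving stages, every output bin should equal the input amplitude divided by eight.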
Keywords:
Concurrent computing, Streaming media, Computer architecture, Application software, Application specific integrated circuits, Digital signal processing, Parallel processing, Scalability, Delay, Bandwidth
Published
10 November 2003
How to Cite
KAMALIZAD, A. H.; PAN, C.; BAGHERZADEH, N. Fast parallel FFT on a reconfigurable computation platform. In: INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE AND HIGH PERFORMANCE COMPUTING (SBAC-PAD), 15., 2003, São Paulo/SP. Proceedings [...]. Porto Alegre: Sociedade Brasileira de Computação, 2003. p. 254-259.
