Matrix calculations with SIMD floating point instructions on x86 processors

André Muezerie; Raul J. Nakashima; Gonzalo Travieso; Jan Slaets

doi:10.5753/sbac-pad.2001.22192

André Muezerie USP
Raul J. Nakashima USP
Gonzalo Travieso USP
Jan Slaets USP

Abstract

This paper describes and evaluates the use of SIMD floating-point instructions for scientific calculations. The performance of these instructions is compared with that of ordinary floating-point code. Implementation concerns, the effects of loop unrolling, and variations in matrix size are analyzed. Execution speeds are compared using matrix multiplication. Because the SIMD floating-point implementations of different manufacturers are mutually incompatible, two instruction sets are required: 3DNow! on the AMD K6 processor and the Streaming SIMD Extensions (SSE) on the Intel Pentium III processor.
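To make the comparison concrete, the following is a minimal sketch of the kind of kernel pair the abstract refers to: an ordinary scalar matrix multiplication and an SSE variant that computes four adjacent result columns at once. It is not the authors' code. The function names `matmul_scalar` and `matmul_sse`, the row-major layout, the use of compiler intrinsics from `<xmmintrin.h>`, and the requirement that `n` be a multiple of 4 are all assumptions made for illustration. On a compiler without SSE support the second function simply falls back to the scalar one.

```c
#include <stddef.h>

/* Assumption: SSE intrinsics are used here for illustration only. */
#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>
#define HAVE_SSE 1
#endif

/* Scalar reference: C = A * B for n x n row-major float matrices. */
void matmul_scalar(size_t n, const float *a, const float *b, float *c)
{
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            float sum = 0.0f;
            for (size_t k = 0; k < n; k++)
                sum += a[i * n + k] * b[k * n + j];
            c[i * n + j] = sum;
        }
    }
}

/* SSE sketch: each inner loop accumulates C[i][j..j+3] in one register.
 * Assumption: n is a multiple of 4 (no remainder handling). */
void matmul_sse(size_t n, const float *a, const float *b, float *c)
{
#ifdef HAVE_SSE
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j += 4) {
            __m128 acc = _mm_setzero_ps();
            for (size_t k = 0; k < n; k++) {
                /* Broadcast A[i][k] to all four lanes. */
                __m128 a_ik = _mm_set1_ps(a[i * n + k]);
                /* Load B[k][j..j+3]; unaligned load avoids alignment assumptions. */
                __m128 b_kj = _mm_loadu_ps(&b[k * n + j]);
                acc = _mm_add_ps(acc, _mm_mul_ps(a_ik, b_kj));
            }
            _mm_storeu_ps(&c[i * n + j], acc);
        }
    }
#else
    /* No SSE available: fall back to the scalar kernel. */
    matmul_scalar(n, a, b, c);
#endif
}
```

Both functions take the matrix dimension followed by the two inputs and the output buffer, so they can be timed against each other on the same data. Unrolling the `k` loop by hand, one of the effects the paper analyzes, would be a further variation on either kernel.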

Keywords: SIMD, 3DNow!, SSE, vector operations, performance evaluation
