Performance analysis of matrix calculus in parallel systems using AVX-512
Abstract
Motivated by the software optimization opportunities opened up by recent hardware technologies, this study analyzes the benefit of hardware-based vectorization, namely AVX2 and AVX-512, in a matrix multiplication scenario. The results show that vectorization yields significant performance gains, with the advantages of AVX-512 standing out.
Keywords:
Parallel and Distributed Algorithms, Machine Learning, Data Science, High-Performance Computing
References
Capra, M., Bussolino, B., Marchisio, A., Masera, G., Martina, M., and Shafique, M. (2020). Hardware and software optimizations for accelerating deep neural networks: Survey of current trends, challenges, and the road ahead. IEEE Access, 8:225134-225180.
Cornea, M. (2015). Intel AVX-512 instructions and their use in the implementation of math functions. Intel Corporation, pages 1-20.
Libório, A. and Baldassin, A. (2021). Análise de desempenho do cálculo matricial em sistemas paralelos utilizando OpenMP. In Anais da XII Escola Regional de Alto Desempenho de São Paulo, pages 13-16. SBC.
Müller, M., de Supinski, B., and Chapman, B. (2009). Evolving OpenMP in an Age of Extreme Parallelism. Springer.
Rathore, Y. and Kumar, D. (2014). Performance evaluation of matrix multiplication using OpenMP for single dual and multi-core machines. IOSR Journal of Engineering (IOSRJEN), 4:56-59.
Published
2022-04-07
How to Cite
LIBÓRIO, André; BALDASSIN, Alexandro; PAPA, João Paulo. Performance analysis of matrix calculus in parallel systems using AVX-512. In: REGIONAL SCHOOL OF HIGH PERFORMANCE COMPUTING FROM SÃO PAULO (ERAD-SP), 13., 2022, Online. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2022. p. 17-20. DOI: https://doi.org/10.5753/eradsp.2022.222245.
