SAPIVe: Simple AVX to PIM Vectorizer


Larger vector extensions are one of the commonly used techniques to meet the growing demands from computational systems. These extensions, capable of operating over multiple data elements with a single instruction, exert a lot of pressure on the memory hierarchy, increasing the impact of growing problems such as Memory-Wall and von Neumann bottleneck. An alternative to work around these problems would be adding processing elements close to the memory, known as Processing-In-Memory (PIM). As with processor vector extensions, the most efficient PIM techniques use in-memory vector processing units. There are several ways to convert a code into in-memory vector processing, such as binary hardware translation, which may not depend on programmers or adapted software and can be carried out transparently to its users. However, in the context of in-memory processing, this conversion technique presents some challenges related to the PIM instructions format and the structure of the loops present in each application. Thus, this article proposes and evaluates Simple AVX to PIM Vectorizer (SAPIVe), a hardware binary translation mechanism from processor vector instructions into in-memory vector instructions, which, in addition to processing more data, also performs loads, operations, and stores at once. Our results show that our mechanism can accelerate kernels up to 5 times with possible performance losses prevented using loop predictors.
Palavras-chave: processing-in-memory, hardware, binary, translator, vectorizer
SOKULSKI, Rodrigo M.; SANTOS, Paulo C.; DOS SANTOS, Sairo R.; ALVES, Marco A. Z.. SAPIVe: Simple AVX to PIM Vectorizer. In: SIMPÓSIO BRASILEIRO DE ENGENHARIA DE SISTEMAS COMPUTACIONAIS (SBESC), 12. , 2022, Fortaleza/CE. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2022 . p. 155-162. ISSN 2237-5430.