Exploiting the Potential of Flexible Processing Units
Abstract
To meet the increased computational demands and stricter power constraints of modern applications, architectures have evolved to include domain-specific accelerators. Designing efficient accelerators requires addressing three main challenges: compute, memory, and control. Moreover, since SoCs usually contain multiple accelerators, selecting the right one for each task also becomes crucial. This is especially relevant for Flexible Processing Units (xPUs), processing units that provide multiple functionalities with the same hardware. While it is possible to use shared support components for all functionalities, doing so leads to sub-optimal performance. In this work, we take one example of such an xPU and analyze the aspects that have not yet been fully addressed, showing that there is more potential to be exploited. By understanding the required memory patterns, we can achieve a speedup of up to 72% compared to using the memory support optimized for a different functionality. Furthermore, we present an in-depth analysis of the different functionalities provided by the xPU. We then leverage the insights obtained from this analysis by providing a mechanism that selects the right functionality, maximizing hardware utilization.
Keywords:
Flexible Processing Unit, Vector Unit, Systolic Array, GEMM, DNN, Scientific Computing
Published
17/10/2023
How to Cite
VÁZQUEZ, Mateo; AZHAR, Muhammad Waqar; TRANCOSO, Pedro. Exploiting the Potential of Flexible Processing Units. In: INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE AND HIGH PERFORMANCE COMPUTING (SBAC-PAD), 35., 2023, Porto Alegre/RS. Proceedings [...]. Porto Alegre: Sociedade Brasileira de Computação, 2023. p. 34-45.