PLAIN: A Tool for Developing Accelerators for FPGA Overlays in the Cloud at Runtime

Fernando Passe; Lucas Bragança; Michael Canesche; Felippe Cathoud; José Nacif; Ricardo Ferreira

doi:10.5753/wscad.2020.14054

Fernando Passe UFV
Lucas Bragança UFV
Michael Canesche UFV
Felippe Cathoud UFV
José Nacif UFV
Ricardo Ferreira UFV

Abstract

FPGAs offer energy efficiency for developing dataflow accelerators in the cloud. However, several challenges hinder their wider adoption, among them compilation time (which can take hours) and the hardware knowledge required to use high-level synthesis languages properly. Recently, the READY tool reduced compilation and configuration time to microseconds. The environment was validated on the Intel/Altera HARP 2 cloud platform. Although READY integrates with C++ for application development, the accelerator itself is described textually as a graph. This work presents the PLAIN extension, which includes an online graphical interface for describing accelerators, automation of the design flow, two simulation levels, and one execution level. The tool also displays performance statistics and allows new operators to be created for design space exploration.

References

Chin, S. A., Niu, K. P., Walker, M., Yin, S., Mertens, A., Lee, J., and Anderson, J. H. (2018). Architecture exploration of standard-cell and FPGA-Overlay CGRAs using the open-source CGRA-ME framework. In Int. Symposium on Physical Design.

Dave, S. and Shrivastava, A. (2017). CCF: A CGRA compilation framework. https://github.com/MPSLab-ASU/ccf. Accessed: 2020-08-11.

Ferreira, R., Cardoso, J. M., and Neto, H. C. (2004). An environment for exploring data-driven architectures. In Int. Conf. on Field Programmable Logic and Applications (FPL).

Ferreira, R., Vendramini, J., and Nacif, M. (2011). Dynamic reconfigurable multicast interconnections by using radix-4 multistage networks in FPGA. In IEEE International Conference on Industrial Informatics.

Franz, M., Lopes, C. T., Huck, G., Sumer, O., and Bader, G. D. (2016). Cytoscape.js: a graph theory library for visualisation and analysis. Bioinformatics, 32(2).

Intel (2020). Intel Xeon with integrated FPGA systems at PC2. https://wikis.uni-paderborn.de/pc2doc/HARP2. Accessed: 2020-08-11.

JSON (2020). Introducing JSON. https://www.json.org/json-en.html. Accessed: 2020-07-25.

Krommydas, K., Sasanka, R., and Feng, W.-c. (2016). Bridging the FPGA programmability-portability gap via automatic OpenCL code generation and tuning. In Int. Conf. on Application-specific Systems, Architectures and Processors (ASAP).

Luebbers, E., Liu, S., and Chu, M. (2020). Simplify software integration for FPGA accelerators with OPAE.

Mutigwe, C. and Aghdasi, F. (2013). Instruction set usage analysis for application-specific systems design. Int'l Journal of Information Technology and Computer Science, 7(2).

Nane, R., Sima, V.-M., Pilato, C., Choi, J., Fort, B., Canis, A., Chen, Y. T., Hsiao, H., Brown, S., Ferrandi, F., et al. (2015). A survey and evaluation of FPGA high-level synthesis tools. IEEE Trans. on CAD of Integrated Circuits and Systems.

Nickolls, J., Buck, I., Garland, M., and Skadron, K. (2008). Scalable parallel programming with CUDA. Queue, 6(2):40–53.

Nowatzki, T., Gangadhar, V., Ardalani, N., and Sankaralingam, K. (2017). Stream-dataflow acceleration. In Int. Symposium on Computer Architecture (ISCA).

Penha, J., Silva, L., Silva, J., Coelho, K., Baranda, H., Nacif, J., and Ferreira, R. (2019). ADD: Accelerator design and deploy - a tool for FPGA high-performance dataflow computing. Concurrency and Computation: Practice and Experience, 31(18).

Silva, L. B. D., Ferreira, R., Canesche, M., Menezes, M. M., Vieira, M. D., Penha, J., Jamieson, P., and Nacif, J. A. M. (2019). READY: A fine-grained multithreading overlay framework for modern CPU-FPGA dataflow applications. ACM Transactions on Embedded Computing Systems (TECS), 18(5s):1–20.

Stanojević, I., Kovačević, M., and Šenk, V. (2019). Application of Maxeler dataflow supercomputing to spherical code design. In Exploring the DataFlow Supercomputing Paradigm, pages 133–168. Springer.

Wijtvliet, M., Waeijen, L., and Corporaal, H. (2016). Coarse grained reconfigurable architectures in the past 25 years: Overview and classification. In Int. Conf. on Embedded Computer Systems: Architectures, Modeling and Simulation (SAMOS).