Automatic hardware generation from programs described in C with pragmas
Abstract
High-level synthesis compilers are becoming increasingly popular. They make it possible to transform high-level code into hardware simply and quickly. However, current solutions generate hardware that does not exploit techniques for improving hardware pipelining. This work presents the LALPC compiler, which applies techniques for exploiting parallelism on FPGAs from designs described in C. These techniques identify and apply optimizations to accelerate loop-based code segments. LALPC can generate high-performance systems while allowing the programmer to explore the design space.
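As an illustration of the kind of loop-based code such a compiler targets, the following is a minimal sketch of a multiply-accumulate kernel in C. The function name `dot_product` and the placement of the directive are assumptions made for this example; the actual LALPC pragma syntax is not given in this text, so the directive appears only as a placeholder comment.

```c
#include <stddef.h>

/* Illustrative kernel only: not taken from the LALPC benchmarks.
 * Each iteration performs one multiply-accumulate, so the only
 * loop-carried dependence is the running sum. Loops of this shape
 * are common candidates for pipelining in hardware. */
int dot_product(const int *a, const int *b, size_t n)
{
    int sum = 0;

    /* A tool-specific pragma requesting pipelining or unrolling would be
     * placed here. Its name and arguments are hypothetical, so it is left
     * as a comment rather than written as a real directive. */
    for (size_t i = 0; i < n; i++) {
        sum += a[i] * b[i];
    }
    return sum;
}
```

Because the directive is a comment, the function compiles and runs as ordinary software, which gives a reference result to compare against any generated hardware.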