Energy Efficiency in High-Performance Computing: An Architecture and Programming Approach to Green Computing

Stéfano D. K. Mór (UFRGS); Marco A. Z. Alves (UFRGS); João V. F. Lima (UFRGS); Nicolas B. Maillard (UFRGS); Philippe O. A. Navaux (UFRGS)

Abstract

This article examines, from the standpoint of computer architecture and parallel programming, the demand for parallel platforms that combine high performance with low energy consumption. It presents the view of the Parallel and Distributed Processing Group (GPPD) at UFRGS on the relationship between low electrical energy consumption, one of the main factors in building “green systems” (green computing), and high computational efficiency. This context is related to the challenges for Brazilian computing research proposed by the Brazilian Computer Society (Sociedade Brasileira de Computação) in 2006. We show that parallelism is a key factor in the distributed processing of large volumes of data, and that this processing can be carried out with efficient and scalable energy consumption. We present work that regulates energy consumption by making maximal use of the active processors (at the programming level), as well as studies aimed at reducing the power consumption of the components in a parallel environment (at the architecture level). Finally, we present our perspectives on using parallelism as a tool for achieving rational energy consumption in the coming years.
