Evaluation of Energy Consumption in Multi-Core Architectures with Shared Cache Memory

  • Matheus Souza PUC Minas
  • Henrique Freitas PUC Minas
  • Marco Alves UFRGS
  • Philippe Navaux UFRGS

Abstract


A challenge in the design of multi-core processors is achieving the best possible energy efficiency. This paper presents energy consumption results for a simulated chip multiprocessor (CMP) architecture with different L2 cache sharing modes under varied multi-threaded workloads. Reducing the cache size and distributing it across groups of cores lowered power consumption by up to 18.10% overall and by 70.51% for memory accesses, but at the cost of a performance drop of up to 57.13% and an increase in the L2 cache miss rate of up to 93.66%. Overall, the version with 32 private L2 cache banks consumed up to 53.27% less energy.

Published
28 July 2014
SOUZA, Matheus; FREITAS, Henrique; ALVES, Marco; NAVAUX, Philippe. Avaliação do Consumo Energético em Arquiteturas Multi-Core com Memória Cache Compartilhada. In: WORKSHOP EM DESEMPENHO DE SISTEMAS COMPUTACIONAIS E DE COMUNICAÇÃO (WPERFORMANCE), 13., 2014, Brasília. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2014. p. 1-13. ISSN 2595-6167.