Avaliação do Consumo Energético em Arquiteturas Multi-Core com Memória Cache Compartilhada

  • Matheus Souza, PUC Minas
  • Henrique Freitas, PUC Minas
  • Marco Alves, UFRGS
  • Philippe Navaux, UFRGS

Abstract


A key design challenge for multi-core processors is achieving the best possible energy efficiency. This paper presents energy consumption results for a simulated chip multiprocessor (CMP) architecture with different shared L2 cache models under different multi-threaded workloads. Reducing the cache size and distributing the cache among groups of processing cores reduced overall power consumption by up to 18.10%, and memory access power consumption by up to 70.51%, at the cost of a performance decrease of up to 57.13% and an increase in L2 cache misses of up to 93.66%. In general, the version with 32 private L2 cache banks consumed up to 53.27% less energy.

Published
2014-07-28
SOUZA, Matheus; FREITAS, Henrique; ALVES, Marco; NAVAUX, Philippe. Avaliação do Consumo Energético em Arquiteturas Multi-Core com Memória Cache Compartilhada. In: WORKSHOP ON PERFORMANCE OF COMPUTER AND COMMUNICATION SYSTEMS (WPERFORMANCE), 13., 2014, Brasília. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2014. p. 1-13. ISSN 2595-6167.