Evaluating dead-line predictors efficiency with drowsy technique
Abstract
In recent years, the steady reduction in transistor size has allowed cache memories to grow greatly in capacity. Today, cache memories occupy nearly 50% of the processor's chip area; this increase was also driven by the memory wall and dark silicon issues. However, this growth in capacity increases the energy needed to retain data in, and operate on, such large caches, which makes cache energy consumption an important area of study. Many methods already exist to reduce the energy consumed by these caches. In this paper, we evaluate the integration of existing dead cache line predictors with the drowsy cache technique, analyzing the benefits in terms of energy consumption and execution time.