A Comprehensive Performance Evaluation to GPGPU Applications under STT-RAM based Hybrid Cache Architectures

Jingjing Fu; Yu Liu

Jingjing Fu Clarkson University
Yu Liu Clarkson University

Abstract

General-purpose graphics processing units (GPGPUs) have become the prevailing platform for complex scientific and engineering computing in the exascale era, owing to the massively parallel computing capability of their many-core architecture. At the same time, the probability of soft errors caused by particle strikes in large-scale computing systems built from GPGPUs has increased significantly. Spin-Transfer Torque RAM (STT-RAM) stores information in a Magnetic Tunnel Junction (MTJ), which makes it immune to soft errors and therefore a feasible soft-error-resilient solution. However, STT-RAM suffers from high latency and energy overheads on write operations, which has led to hesitation in adopting it in memory system design. A comprehensive performance evaluation of adopting STT-RAM in the memory hierarchy of the GPGPU architecture (i.e., hybrid STT-RAM/SRAM cache architectures) is therefore necessary. This work offers a fair and comprehensive performance evaluation of GPGPU applications under different cache associativities and multiple plans for partially or completely adopting STT-RAM in the GPGPU memory hierarchy, which could offer useful options for soft-error-resilient GPGPU architecture design. In addition, this work shows that a proper combination of cache configuration and adoption plan may incur only a slight drop in timing performance and equivalent energy consumption, while still benefiting from soft error resilience.

Keywords: GPGPU, STT-RAM, Cache, Timing, Energy Consumption, Soft Error Resilience
