Mitigating the Impact of Processor Degradation via Multiprogramming

Mariana Costa; Sandro M. V. N. Marques; Fábio D. Rossi; Marcelo C. Luizelli; Antonio Carlos S. Beck; Arthur F. Lorenzon

doi:10.5753/wscad.2021.18515

Mariana Costa UNIPAMPA
Sandro M. V. N. Marques UNIPAMPA
Fábio D. Rossi IFFar
Marcelo C. Luizelli UNIPAMPA
Antonio Carlos S. Beck UFRGS
Arthur F. Lorenzon UNIPAMPA

Abstract

The number of cores on a single chip has grown with each new processor generation to meet the performance demands of modern applications. However, power density (power consumed per unit area) has grown as well, raising operating temperatures and accelerating the phenomena responsible for processor degradation. Controlling the temperature of computing systems is therefore essential to extending the lifetime of computing resources. We propose PampaAging: a dynamic, automatic, and transparent approach that tunes the number of threads and the allocation of hardware resources for the concurrent execution of a set of applications, with the goal of maximizing the lifetime of hardware components while also optimizing the performance of the parallel applications. By running twenty-four applications on two multicore architectures (Intel and AMD), we show that PampaAging improves processor lifetime by up to 42% and performance by 2.52 times compared to the standard way in which parallel applications are executed.
