Increasing the Energy Efficiency of the DFT Method by Reducing Computation Time Using GPUs
Abstract
Density Functional Theory (DFT) is one of the most popular and versatile methods available in condensed matter physics and in computational physics and chemistry. It underlies most materials simulation systems. However, because of the complexity of the calculations involved, DFT demands a great deal of computing power. The long run times these computations require increase energy consumption, which is now one of the major environmental concerns. In this work, we propose an efficient, energy-aware solution that exploits the high arithmetic throughput of modern graphics cards (GPUs). Our GPU implementation achieved a significant speedup over a traditional CPU implementation while consuming 20 times less energy.
References
Born, M. and Oppenheimer, R. (1927). On the quantum theory of molecules. Annalen der Physik, 84:457.
Chakraborty, K., Wells, P. M., and Sohi, G. S. (2007). A case for an over-provisioned multicore system: Energy efficient processing of multithreaded programs. Technical report.
Genovese, L., Ospici, M., Deutsch, T., Méhaut, J., Neelov, A., and Goedecker, S. (2009). Density Functional Theory calculation on many-cores hybrid CPU-GPU architectures. arXiv, 904.
Hohenberg, P. and Kohn, W. (1964). Inhomogeneous electron gas. Phys. Rev., 136(3B):B864–B871.
Huang, S., Xiao, S., and Feng, W. (2009). On the energy efficiency of graphics processing units for scientific computing. In IPDPS ’09: Proceedings of the 2009 IEEE International Symposium on Parallel & Distributed Processing, pages 1–8.
Jacobsen, C. J. H., Dahl, S., Boisen, A., Clausen, B. S., Topsøe, H., Logadottir, A., and Nørskov, J. K. (2002). Optimal catalyst curves: Connecting density functional theory calculations with industrial reactor design and catalyst selection. Journal of Catalysis, 205(2):382–387.
Jiao, Y. and Hurson, A. R. (2007). Energy-efficient wireless information retrieval. J. Comput. Syst. Sci., 73(8):1145–1163.
Kansal, A. and Zhao, F. (2008). Fine-grained energy profiling for power-aware application design. SIGMETRICS Perform. Eval. Rev., 36(2):26–31.
Kleinman, L. and Bylander, D. (1982). Efficacious form for model pseudopotentials. Physical Review Letters, 48(20):1425–1428.
Kohn, W. and Sham, L. (1965). Self-consistent equations including exchange and correlation effects. Phys. Rev., 140(4A):A1133–A1138.
Moradian, R., Behzad, S., and Azadi, S. (2008). Ab initio density functional theory investigation of electronic properties of semiconducting single-walled carbon nanotube bundles. Physica E: Low-dimensional Systems and Nanostructures, 40(10):3055–3059.
NVIDIA (2007). CUDA CUFFT Library. NVIDIA Corporation.
Ordejón, P., Artacho, E., and Soler, J. (1996). Self-consistent order-N density-functional calculations for very large systems. Physical Review B, 53(16):10441–10444.
Qin, W., Li, X., Bian, W.-W., Fan, X.-J., and Qi, J.-Y. (2010). Density functional theory calculations and molecular dynamics simulations of the adsorption of biomolecules on graphene surfaces. Biomaterials, 31(5):1007–1016.
Ramani, K., Ibrahim, A., and Shimizu, D. (2006). PowerRed: A flexible modeling framework for power efficiency exploration in GPUs. In Proceedings of the Workshop on General Purpose Processing on GPUs, GPGPU’07.
Raybaud, P., Hafner, J., Kresse, G., Kasztelan, S., and Toulhoat, H. (2000). Structure, energetics, and electronic properties of the surface of a promoted MoS2 catalyst: An ab initio local density functional study. Journal of Catalysis, 190(1):128–143.
Reinman, G., Calder, B., and Austin, T. M. (2002). High performance and energy efficient serial prefetch architecture. In ISHPC ’02: Proceedings of the 4th International Symposium on High Performance Computing, pages 146–159.
Rivoire, S., Shah, M. A., Ranganathan, P., and Kozyrakis, C. (2007). JouleSort: a balanced energy-efficiency benchmark. In SIGMOD ’07: Proceedings of the 2007 ACM SIGMOD international conference on Management of data, pages 365–376.
Rofouei, M., Stathopoulos, T., Ryffel, S., Kaiser, W., and Sarrafzadeh, M. (2008). Energy-Aware High Performance Computing with Graphic Processing Units. In Workshop on Power Aware Computing and System.
Schrödinger, E. (1926). An undulatory theory of the mechanics of atoms and molecules. Physical Review, 28:1049–1070.
Sheaffer, J. W., Skadron, K., and Luebke, D. P. (2005). Studying thermal management for graphics-processor architectures. In Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2005, pages 54–65.
Soler, J., Artacho, E., Gale, J., García, A., Junquera, J., Ordejón, P., and Sánchez-Portal, D. (2002). The SIESTA method for ab initio order-N materials simulation. Journal of Physics: Condensed Matter, 14:2745–2779.
Sun, W. and Ma, Z. (2009). Count Sort for GPU Computing. In 2009 15th International Conference on Parallel and Distributed Systems, pages 919–924. IEEE.
Takizawa, H., Sato, K., and Kobayashi, H. (2008). SPRAT: Runtime processor selection for energy-aware computing. In IEEE Cluster, pages 386–393. IEEE Computer Society.
Temperton, C. (1992). A Generalized Prime Factor FFT Algorithm for any N = 2^p 3^q 5^r. SIAM Journal on Scientific and Statistical Computing, 13:676.
Trapnell, C. and Schatz, M. (2009). Optimizing data intensive GPGPU computations for DNA sequence alignment. Parallel Computing.
Vahdat, A., Lebeck, A., and Ellis, C. S. (2000). Every joule is precious: the case for revisiting operating system design for energy efficiency. In EW 9: Proceedings of the 9th workshop on ACM SIGOPS European workshop, pages 31–36.
Yang, J., Wang, Y., and Chen, Y. (2007). GPU accelerated molecular dynamics simulation of thermal conductivities. Journal of Computational Physics, 221(2):799–804.
Yang, S., Adjaye, J., McCaffrey, W. C., and Nelson, A. E. (2010). Density-functional theory (DFT) study of arsenic poisoning of NiMoS. Journal of Molecular Catalysis A: Chemical, 321(1-2):83–91.
Yasuda, K. (2008). Accelerating density functional calculations with graphics processing unit. Journal of Chemical Theory and Computation, 4(8):1230–1236.
Published
20/07/2010
How to Cite
SILVA, C. P.; CUPERTINO, L. F.; CHEVITARESE, D. S.; PACHECO, M. A. C.; BENTES, C. Aumento da Eficiência Energética do Método DFT Através da Redução do Tempo de Cálculo Utilizando GPU. In: WORKSHOP EM DESEMPENHO DE SISTEMAS COMPUTACIONAIS E DE COMUNICAÇÃO (WPERFORMANCE), 9., 2010, Belo Horizonte/MG. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2010. p. 1790-1803. ISSN 2595-6167.