Execution-Regime-Aware Energy Evaluation on GPUs: PyTorch and External Preprocessing in Go/Rust

Murilo Salem; Daniel Pontes; Luísa Bohm; Henrique dos Reis; Gerson Geraldo H. Cavalheiro (all UFPel)

doi:10.5753/eradrs.2026.21457

Abstract

The rising energy cost of AI/ML workloads on GPUs calls for benchmarks that measure end-to-end energy rather than throughput alone. We present GC-Bench, a configuration-driven harness that expands YAML plans into reproducible experiments, samples GPU power via NVML with a fallback to nvidia-smi, and generates statistically grounded reports. We evaluate two execution paths over the same TinyLM backbone in PyTorch: one in pure PyTorch and one with external tokenization in Go/Rust. Across 1,018 validated runs, training was nearly tied, while inference favored pure PyTorch, which reduced GPU J/token by 6.73%. For host-to-device (H2D) transfers, pinned memory raised bandwidth by 2.82× and reduced J/GB by 19%. GC-Bench is released as an open artifact to support reproducible energy studies.
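To make the J/token metric concrete, here is a minimal sketch of how sampled GPU power could be turned into energy and then normalized per token. It is not GC-Bench's actual code: the function names, the trapezoidal integration rule, and the sample values are all assumptions made for illustration. In a real run the samples might come from NVML (for example, `pynvml.nvmlDeviceGetPowerUsage` reports milliwatts, which would need dividing by 1000 first).

```python
from typing import Sequence


def energy_joules(timestamps_s: Sequence[float], power_w: Sequence[float]) -> float:
    """Integrate sampled power (watts) over time (seconds) into joules.

    Assumption: trapezoidal rule between consecutive samples; the source
    does not state which integration scheme GC-Bench uses.
    """
    if len(timestamps_s) != len(power_w):
        raise ValueError("timestamps and power samples must have the same length")
    if len(timestamps_s) < 2:
        raise ValueError("need at least two samples to integrate")
    total = 0.0
    for i in range(1, len(timestamps_s)):
        dt = timestamps_s[i] - timestamps_s[i - 1]
        if dt < 0:
            raise ValueError("timestamps must be non-decreasing")
        # Area of the trapezoid spanned by two neighboring samples.
        total += 0.5 * (power_w[i] + power_w[i - 1]) * dt
    return total


def joules_per_token(energy_j: float, tokens: int) -> float:
    """Normalize energy by the number of tokens processed in the window."""
    if tokens <= 0:
        raise ValueError("tokens must be positive")
    return energy_j / tokens


def relative_reduction_pct(baseline: float, candidate: float) -> float:
    """Percentage by which `candidate` is lower than `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * (baseline - candidate) / baseline


if __name__ == "__main__":
    # Hypothetical samples: one reading every 0.5 s over a 2 s window.
    ts = [0.0, 0.5, 1.0, 1.5, 2.0]
    watts = [100.0, 120.0, 120.0, 110.0, 100.0]
    e = energy_joules(ts, watts)
    print(e, joules_per_token(e, tokens=1000))
```

A reduction such as the 6.73% reported for inference would then be the relative difference between the J/token values of the two execution paths, each averaged over its validated runs.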
