A Framework for Analytical Performance and Energy Prediction of DL Training on GPUs
Abstract
The rapid scaling of deep learning (DL) models calls for accurate and interpretable performance and energy prediction tools that support efficient resource management and sustainable AI development. Existing modeling approaches often lack the granularity to capture nuanced hardware-software interactions and the flexibility to adapt to diverse modern architectures. This paper introduces an analytical framework for predicting the time and energy of DL training workloads on GPUs. The framework uses detailed workload characterization (FLOPs, memory accesses, kernel activity, and novel structural features) to derive an architecture-aware efficiency model, in which a saturation-based function captures the effect of dimensional scaling on hardware utilization. We propose an iterative refinement methodology that incorporates model-specific scalars for particular architectures such as ALBERT and precision-specific calibrations for BF16 operations. Our benchmark of six advanced DL models (including CNNs, BERT-style Transformers, and LLMs such as TinyLlama) on NVIDIA A100 GPUs under various configurations (1 or 4 GPUs; FP32, TF32, or mixed BF16) shows that the approach achieves high predictive accuracy, with an overall relative error of 4.14% (3.05% for time, 5.78% for power). The framework is intended to provide insights for HPC-AI co-design, energy-aware scheduling, and performance optimization.
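To make the idea of a saturation-based efficiency function concrete, the following is a minimal sketch and not the paper's actual model. It assumes a hypothetical hyperbolic saturation curve, a simple roofline-style time estimate (the slower of compute and memory traffic), and energy as average power multiplied by time. The function names, the parameters `eff_max` and `dim_half`, and the default hardware figures (nominal A100 40 GB datasheet values for FP32 peak throughput and memory bandwidth) are all assumptions chosen for illustration.

```python
def saturating_efficiency(dim, eff_max=0.9, dim_half=512.0):
    """Fraction of peak throughput reached at problem dimension `dim`.

    Hypothetical form: grows with `dim` and plateaus at `eff_max`;
    `dim_half` is the dimension at which half of `eff_max` is reached.
    """
    return eff_max * dim / (dim + dim_half)


def predict_time_s(flops, bytes_moved, dim,
                   peak_flops=19.5e12,   # assumed: nominal A100 FP32 peak, FLOP/s
                   mem_bw=1.555e12):     # assumed: nominal A100 40 GB bandwidth, B/s
    """Roofline-style estimate: the slower of compute and memory traffic."""
    compute_s = flops / (peak_flops * saturating_efficiency(dim))
    memory_s = bytes_moved / mem_bw
    return max(compute_s, memory_s)


def predict_energy_j(time_s, avg_power_w):
    """Energy in joules as average power (W) times duration (s)."""
    return time_s * avg_power_w


def relative_error(predicted, measured):
    """Relative error of a prediction against a measurement."""
    return abs(predicted - measured) / measured
```

With these assumed defaults, small dimensions yield low utilization and therefore longer predicted compute time per FLOP, while large dimensions approach the `eff_max` plateau. A calibrated model would fit such parameters per architecture and precision rather than fixing them by hand.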
Keywords:
Training, Deep learning, Analytical models, Adaptation models, Accuracy, Computational modeling, Predictive models, Hardware, Artificial intelligence, Optimization, Runtime prediction, Power modeling, Energy efficiency, Analytical framework, HPC-AI, DL training, GPU modeling
Published
28/10/2025
How to Cite
TCHAKOUTE, Roblex Nana; TADONKI, Claude; DOKLADAL, Petr; MESRI, Youssef. A Framework for Analytical Performance and Energy Prediction of DL Training on GPUs. In: INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE AND HIGH PERFORMANCE COMPUTING (SBAC-PAD), 37., 2025, Bonito/MS. Proceedings [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 215-226.
