Energy-Aware Deep Learning on GPUs through Parameter Sharing and Mixed Precision Training
Abstract
The design of Deep Learning models increasingly relies on advanced techniques such as parameter sharing and mixed precision training to manage computational and memory costs. Although effective in theory, their practical impact on system-level performance, energy consumption, and memory subsystem behavior is complex and interdependent. This paper presents a performance/energy trade-off analysis of these techniques through an empirical case study. We benchmark a diverse suite of six models, including a direct comparison of DistilBERT (conventional) and ALBERT (parameter-sharing), on a multi-GPU NVIDIA A100 platform. Our analysis across FP32, TF32, and mixed BF16 precisions reveals two key findings. First, despite its smaller parameter count, ALBERT is empirically up to 2.2× slower and incurs up to ~3× higher GPU memory footprint than DistilBERT; we attribute this effect to the runtime unrolling of its shared layers, which materializes activations for every repeated layer and thereby incurs substantial overhead. Second, while mixed BF16 precision provides an average training speed-up of ~2.1×, the benefits are strongly model-dependent. Using an empirical Throughput-per-Watt (Samples/Joule) efficiency metric, we show that compute-bound models such as TinyLlama benefit the most in energy efficiency, whereas CNNs show only marginal improvements, which we link to implicit TF32 acceleration already present in their FP32 baseline via the cuDNN library.
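The Samples/Joule efficiency metric mentioned above can be sketched as follows. This is a minimal illustration, not the paper's measurement code: the function names (`energy_joules`, `samples_per_joule`) are hypothetical, and it assumes GPU power in watts has been sampled at known timestamps (e.g., via NVML or `nvidia-smi`) over the training run.

```python
# Hypothetical sketch of a Samples/Joule (Throughput-per-Watt) metric.
# Assumes GPU power (watts) was sampled at known timestamps during training.

def energy_joules(timestamps, power_watts):
    """Integrate sampled power over time (trapezoidal rule) -> joules."""
    return sum(
        0.5 * (p0 + p1) * (t1 - t0)
        for (t0, p0), (t1, p1) in zip(
            zip(timestamps, power_watts),
            zip(timestamps[1:], power_watts[1:]),
        )
    )

def samples_per_joule(num_samples, timestamps, power_watts):
    """Training samples processed per joule of GPU energy consumed."""
    return num_samples / energy_joules(timestamps, power_watts)

# Example: a 10 s run at a constant 300 W, processing 6000 samples.
ts = [0.0, 5.0, 10.0]
pw = [300.0, 300.0, 300.0]
print(samples_per_joule(6000, ts, pw))  # 6000 / 3000 J = 2.0 samples/J
```

Note that samples per joule is numerically identical to (samples per second) per watt, which is why the two formulations are interchangeable.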
Keywords:
Training, Deep learning, Measurement, Runtime, Power demand, Computational modeling, Memory management, Graphics processing units, Benchmark testing, Energy efficiency, runtime analysis, power consumption, benchmarking, mixed precision, deep learning training, GPU modeling
Published
28/10/2025
How to Cite
TCHAKOUTE, Roblex Nana; TADONKI, Claude. Energy-Aware Deep Learning on GPUs through Parameter Sharing and Mixed Precision Training. In: WORKSHOP ON LIGHTWEIGHT EFFICIENT DEEP LEARNING IN HPC ENVIRONMENTS (LEANDL-HPC) - INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE AND HIGH PERFORMANCE COMPUTING (SBAC-PAD), 37., 2025, Bonito/MS. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 116-123.
