Performance and Energy Consumption Evaluation of the VVC ALF Classification Step Across Different Programming Paradigms and Hardware Platforms
Abstract
The Adaptive Loop Filter (ALF) in the Versatile Video Coding (VVC) standard enhances visual quality but introduces a significant computational burden, posing a challenge for its implementation on resource-constrained devices. This paper presents a comprehensive evaluation of the ALF classification step, analyzing its execution time and energy consumption across multiple programming paradigms (scalar, SIMD, CUDA) on both a high-performance desktop and an embedded platform. Results show that on the high-performance desktop, SIMD optimization provides a substantial speedup over the scalar baseline with lower energy consumption, while the discrete GPU’s performance is limited by data transfer overhead. Conversely, the embedded system demonstrates a different landscape: its ARM CPU offers superior energy efficiency when compared to the desktop, and its integrated GPU’s runtime is dominated by kernel execution rather than data transfers. These findings underscore that the optimal ALF implementation is highly platform-dependent, hinging on the specific design trade-offs between processing speed and energy efficiency.
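The two metrics compared in this evaluation, execution time and energy consumption, can be related with a short calculation. The sketch below is only an illustration: the `Measurement` class, the helper functions, and every number in it are hypothetical and are not taken from the paper's results. It assumes energy is estimated as average power multiplied by execution time.

```python
from dataclasses import dataclass


@dataclass
class Measurement:
    """One timed run of a single implementation (hypothetical values only)."""
    name: str
    time_s: float        # execution time in seconds
    avg_power_w: float   # average power draw during the run, in watts

    @property
    def energy_j(self) -> float:
        # Energy in joules = average power (W) * execution time (s).
        return self.avg_power_w * self.time_s


def speedup(baseline: Measurement, candidate: Measurement) -> float:
    """Ratio of baseline time to candidate time (greater than 1 means faster)."""
    return baseline.time_s / candidate.time_s


def energy_ratio(baseline: Measurement, candidate: Measurement) -> float:
    """Ratio of candidate energy to baseline energy (less than 1 means less energy)."""
    return candidate.energy_j / baseline.energy_j


# Placeholder numbers chosen for easy arithmetic, not measured data.
scalar = Measurement("scalar", time_s=2.0, avg_power_w=50.0)
simd = Measurement("simd", time_s=0.5, avg_power_w=60.0)

if __name__ == "__main__":
    print(f"speedup: {speedup(scalar, simd):.2f}x")
    print(f"energy ratio: {energy_ratio(scalar, simd):.2f}")
```

With these placeholder values, the faster implementation draws more power yet uses less energy overall because it finishes sooner, which is the kind of speed versus energy trade-off the abstract describes.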
