Fast ISP Mode Decision for the Versatile Video Coding Intra Prediction Using Machine Learning
Abstract
The Versatile Video Coding (VVC) standard achieves high compression rates by introducing new encoding tools, such as Intra Subpartition Prediction (ISP). However, ISP increases the computational effort of the mode decision in the intra prediction step. This paper proposes a fast intra-mode decision solution for ISP based on machine learning. A Decision Tree predicts which ISP modes are most likely to be optimal, so that the costly rate-distortion optimization (RDO) test can be skipped for ISP modes that are unlikely to be chosen. By reducing the number of modes fully evaluated by the RDO process, the proposed solution achieves an average time saving of 3.15% with a coding efficiency loss of only 0.11% under the VVC common test conditions. Unlike related works, our solution avoids the time overhead of computing image features by reusing features already available from the encoding process. Compared with related works, it delivers competitive time-saving and coding efficiency results.
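The pruning idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the feature names (`width`, `height`, `best_regular_cost`), the thresholds, and the hand-written `toy_tree` that stands in for a trained Decision Tree are all hypothetical. The abstract only states that the features come from the encoding process. The sketch shows how a predictor's output can limit which ISP candidates reach the full RDO evaluation.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical feature set; the abstract does not list the actual features.
Features = Dict[str, float]


def toy_tree(features: Features) -> List[str]:
    """Hand-written stand-in for a trained decision tree (assumption).

    Returns the ISP split candidates considered worth a full RDO test.
    """
    if features["best_regular_cost"] < 100.0:  # arbitrary illustrative threshold
        return []  # regular intra prediction is already cheap: skip ISP
    if features["width"] > features["height"]:
        return ["ISP_VER"]
    if features["width"] < features["height"]:
        return ["ISP_HOR"]
    return ["ISP_HOR", "ISP_VER"]


def choose_mode(
    features: Features,
    predict: Callable[[Features], List[str]],
    rdo_cost: Callable[[str], float],
) -> Tuple[str, int]:
    """Run the costly RDO test only on candidates kept by the predictor.

    Returns the selected mode and the number of full RDO evaluations performed.
    """
    best_mode, best_cost = "NO_ISP", features["best_regular_cost"]
    evaluated = 0
    for mode in predict(features):
        cost = rdo_cost(mode)  # stands in for the encoder's full RDO test
        evaluated += 1
        if cost < best_cost:
            best_mode, best_cost = mode, cost
    return best_mode, evaluated


# Made-up RDO costs for one block, used only to exercise the sketch.
example_costs = {"ISP_HOR": 80.0, "ISP_VER": 120.0}
tall_block = {"width": 8.0, "height": 32.0, "best_regular_cost": 150.0}
cheap_block = {"width": 8.0, "height": 32.0, "best_regular_cost": 50.0}

print(choose_mode(tall_block, toy_tree, example_costs.__getitem__))
print(choose_mode(cheap_block, toy_tree, example_costs.__getitem__))
```

In this toy setup the tall block triggers a single RDO evaluation instead of two, and the cheap block triggers none. Savings of this kind are what the approach targets, traded against the risk of skipping a mode that would have been optimal.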