Deep Learning-Based Intra-Frame Prediction for Dense Light Fields
Abstract
This study proposes a new strategy for intra prediction of dense light fields that recasts the problem as image inpainting and solves it with convolutional neural networks. Multiple architectures and training techniques were evaluated to identify the most efficient configuration for intra prediction in a video encoder for this type of data. A separate network was trained for each of the encoder's three block sizes, and the networks were evaluated both individually and in combination. The results show that using convolutional neural networks as intra predictors significantly improves coding efficiency in the EVC encoder, achieving an average BD-rate reduction of 30.53%.
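To make the inpainting formulation concrete, the sketch below shows one way the predictor's input could be assembled: a context window holding the already-decoded neighbours of a block, with the block itself zeroed out and flagged in a mask. It is only an illustration under stated assumptions. The function name `build_inpainting_input`, the square window of twice the block size, the raster-order coding assumption, and the zero fill are all hypothetical and are not taken from this study; the trained network that would fill the masked region is not shown.

```python
# Hypothetical sketch: intra prediction framed as image inpainting.
# Assumption: blocks are coded in raster order, so the blocks above-left,
# above, and to the left of the current block are already decoded.


def build_inpainting_input(frame, top, left, block_size):
    """Return (context, mask) for the block whose top-left sample is (top, left).

    context: square window of side 2 * block_size whose bottom-right quadrant
             is the block to predict; unknown samples are set to 0.
    mask:    1 where a sample is known (decoded neighbours inside the frame),
             0 where it must be inpainted or lies outside the frame.
    """
    size = 2 * block_size
    y0, x0 = top - block_size, left - block_size
    height, width = len(frame), len(frame[0])

    context = [[0] * size for _ in range(size)]
    mask = [[0] * size for _ in range(size)]

    for dy in range(size):
        for dx in range(size):
            y, x = y0 + dy, x0 + dx
            inside_frame = 0 <= y < height and 0 <= x < width
            in_block = dy >= block_size and dx >= block_size
            if inside_frame and not in_block:
                context[dy][dx] = frame[y][x]
                mask[dy][dx] = 1
    return context, mask


# Toy 8x8 "frame" with distinct sample values, and a 4x4 block at (4, 4).
frame = [[10 * r + c for c in range(8)] for r in range(8)]
context, mask = build_inpainting_input(frame, top=4, left=4, block_size=4)
known_samples = sum(sum(row) for row in mask)
```

In a setup like this, the network would receive `context` (and possibly `mask`) and output an estimate of the bottom-right quadrant, which the encoder would then use as the intra prediction for the block. With one network per block size, `block_size` would select which network to run.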