Generative Adversarial Network and ResNet Comparison for Video Super Resolution in Smartphones
Abstract
Given recent advances in smartphone hardware for neural network processing and in image super-resolution (SR), one question remains: how applicable are established SR techniques on smartphones? Several works targeting different image characteristics have been developed to understand how such networks can serve different segments. This work applies two established networks for photo-realistic image processing, SRResNet and SRGAN, on mobile devices to identify how they behave on smartphones of different generations, in terms of both image quality and response time. In the evaluation, SRResNet performed better on both counts: it reached a PSNR of 27.7075 versus 21.3843 for SRGAN, with a latency of 0.19 milliseconds compared to 0.20 milliseconds for SRGAN. These results indicate that applying established SR techniques on smartphones is feasible.
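For readers unfamiliar with the quality metric reported above, the following is a minimal sketch of how PSNR is conventionally computed between a reference image and a super-resolved estimate. It assumes 8-bit pixel values (peak of 255) and images given as nested lists; the function name `psnr` and its parameters are illustrative and not taken from the paper, whose exact evaluation pipeline is not described here.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Return the peak signal-to-noise ratio, in dB, between two images.

    Both images are nested lists (rows of pixel values) of the same size.
    max_value is the assumed peak pixel value (255 for 8-bit images).
    """
    ref_flat = [p for row in reference for p in row]
    est_flat = [p for row in estimate for p in row]
    if len(ref_flat) != len(est_flat) or not ref_flat:
        raise ValueError("images must be non-empty and the same size")

    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(ref_flat, est_flat)) / len(ref_flat)
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded

    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    ground_truth = [[100, 100], [100, 100]]
    close_guess = [[101, 101], [101, 101]]
    rough_guess = [[110, 90], [110, 90]]
    # A higher PSNR means the estimate is closer to the reference.
    print(psnr(ground_truth, close_guess) > psnr(ground_truth, rough_guess))
```

Under this definition a higher value indicates a reconstruction closer to the reference, which is the sense in which the SRResNet result is reported as better than the SRGAN result.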