Evaluating Loss Functions for Illustration Super-Resolution Neural Networks
Abstract

As display technologies evolve and high-resolution screens become more widely available, demand grows for images and videos of high perceptual quality that can properly exploit these advances. At the same time, the market for illustrated media, such as animations and comics, has grown steadily in recent years. Based on these observations, we were motivated to explore the super-resolution task in the niche of drawings. In the absence of original high-resolution imagery, it is necessary to use approximate methods, such as interpolation algorithms, to enhance low-resolution media. Such methods, however, can produce undesirable artifacts in the reconstructed images, such as blurring and edge distortions. Recent works have successfully applied deep learning to this task, but such efforts are often aimed at real-world photographs and do not take into account the specifics of illustrations, which emphasize lines and employ simplified patterns rather than complex textures; this in turn makes visual artifacts introduced by algorithms easier to spot. With these differences in mind, we evaluated the effect of the choice of loss function on obtaining accurate and perceptually pleasing results in the super-resolution task for comics, cartoons, and other illustrations. Experimental evaluations have shown that, among the evaluated functions, a loss function based on edge detection performs best in this context, though it still leaves room for further improvement.

References
Statista, “Global 4K UHD TV unit sales from 2014 to 2019,” 2019. [Online]. Available: https://www.statista.com/statistics/540680/global-4k-tv-unit-sales/
R. C. Gonzalez and R. E. Woods, Digital Image Processing, 4th ed. Pearson, 2018.
bloc97, “SYNLA Dataset — image super-resolution for anime-style art.” [Online]. Available: https://github.com/bloc97/SYNLA-Dataset
C. Dong, C. C. Loy, K. He, and X. Tang, “Image super-resolution using deep convolutional networks,” 2015.
W. Shi, J. Caballero, F. Huszár, J. Totz, A. P. Aitken, R. Bishop, D. Rueckert, and Z. Wang, “Real-time single image and video super-resolution using an efficient sub-pixel convolutional neural network,” 2016.
W. Yang, X. Zhang, Y. Tian, W. Wang, J.-H. Xue, and Q. Liao, “Deep learning for single image super-resolution: A brief review,” IEEE Transactions on Multimedia, vol. 21, no. 12, pp. 3106–3121, Dec 2019. [Online]. Available: http://dx.doi.org/10.1109/TMM.2019.2919431
J. Yang, J. Wright, T. Huang, and Y. Ma, “Image super-resolution as sparse representation of raw image patches,” in 2008 IEEE conference on computer vision and pattern recognition. IEEE, 2008, pp. 1–8.
J. Yang, J. Wright, T. S. Huang, and Y. Ma, “Image super-resolution via sparse representation,” IEEE transactions on image processing, vol. 19, no. 11, pp. 2861–2873, 2010.
C. Ledig, L. Theis, F. Huszar, J. Caballero, A. Cunningham, A. Acosta, A. Aitken, A. Tejani, J. Totz, Z. Wang, and W. Shi, “Photo-realistic single image super-resolution using a generative adversarial network,” 2017.
B. Lim, S. Son, H. Kim, S. Nah, and K. M. Lee, “Enhanced deep residual networks for single image super-resolution,” 2017.
Anonymous, Danbooru community, and G. Branwen, “Danbooru2020: A large-scale crowdsourced and tagged anime illustration dataset,” January 2021. [Online]. Available: https://www.gwern.net/Danbooru2020
Tyler, “Dandere2x — fast waifu2x video upscaling.” [Online]. Available: https://github.com/akai-katto/dandere2x
Y. LeCun, B. Boser, J. S. Denker, D. Henderson, R. E. Howard, W. Hubbard, and L. D. Jackel, “Backpropagation applied to handwritten zip code recognition,” Neural computation, vol. 1, no. 4, pp. 541–551, 1989.
Y. Zhang, Y. Tian, Y. Kong, B. Zhong, and Y. Fu, “Residual dense network for image super-resolution,” in Proceedings of the IEEE conference on computer vision and pattern recognition, 2018, pp. 2472–2481.
K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in Proceedings of the IEEE conference on computer vision and pattern recognition, 2016, pp. 770–778.
Y. Bengio, P. Simard, and P. Frasconi, “Learning long-term dependencies with gradient descent is difficult,” IEEE transactions on neural networks, vol. 5, no. 2, pp. 157–166, 1994.
M. Haris, G. Shakhnarovich, and N. Ukita, “Deep back-projection networks for super-resolution,” in Proceedings of the IEEE conference on computer vision and pattern recognition, 2018, pp. 1664–1673.
H. Zhao, O. Gallo, I. Frosio, and J. Kautz, “Loss functions for image restoration with neural networks,” IEEE Transactions on computational imaging, vol. 3, no. 1, pp. 47–57, 2016.
Z. Lu and Y. Chen, “Single image super resolution based on a modified U-Net with mixed gradient loss,” 2019.
Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, “Image quality assessment: from error visibility to structural similarity,” IEEE transactions on image processing, vol. 13, no. 4, pp. 600–612, 2004.
K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale image recognition,” 2015.
J. Johnson, A. Alahi, and L. Fei-Fei, “Perceptual losses for real-time style transfer and super-resolution,” 2016.
I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, “Generative adversarial nets,” Advances in neural information processing systems, vol. 27, 2014.
N. Kanopoulos, N. Vasanthavada, and R. L. Baker, “Design of an image edge detection filter using the Sobel operator,” IEEE Journal of Solid-State Circuits, vol. 23, no. 2, pp. 358–367, 1988.
Y. Matsui, K. Ito, Y. Aramaki, A. Fujimoto, T. Ogawa, T. Yamasaki, and K. Aizawa, “Sketch-based manga retrieval using Manga109 dataset,” Multimedia Tools and Applications, vol. 76, no. 20, pp. 21811–21838, 2017.
K. Aizawa, A. Fujimoto, A. Otsubo, T. Ogawa, Y. Matsui, K. Tsubota, and H. Ikuta, “Building a manga dataset “manga109” with annotations for multimedia applications,” IEEE MultiMedia, vol. 27, no. 2, pp. 8–18, 2020.
R. Zeyde, M. Elad, and M. Protter, “On single image scale-up using sparse-representations,” in International conference on curves and surfaces. Springer, 2010, pp. 711–730.
E. Agustsson and R. Timofte, “NTIRE 2017 challenge on single image super-resolution: Dataset and study,” in The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, July 2017.
Published
18/10/2021
How to Cite
NEPOMUCENO, Raphael; SILVA, Michel M. Evaluating Loss Functions for Illustration Super-Resolution Neural Networks. In: WORKSHOP DE TRABALHOS DA GRADUAÇÃO - CONFERENCE ON GRAPHICS, PATTERNS AND IMAGES (SIBGRAPI), 34., 2021, Online. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2021. p. 206-211. DOI: https://doi.org/10.5753/sibgrapi.est.2021.20040.