Does the Input Image Spatial Resolution Generate Different Synthetic Images? A Comparative Study of Facial Expression Synthesis Performance

Abstract


Facial expression synthesis has gained significant attention in the image synthesis field. Generative Adversarial Network (GAN) models have recently become popular because of the high-quality synthetic images they produce. However, these models require complex network architectures that can take days to train, even on high-performance Graphics Processing Units (GPUs). Many efforts have been made to accelerate and compress such models, but little attention has been paid to the resolution of the images. This study assesses the impact of input/output spatial resolution on the resources needed to train a facial expression synthesis model, as well as on the quality of the results. Our results indicate that the images and videos produced at spatial resolutions of 128 × 128, 256 × 256, and 480 × 480 were of similar quality according to objective measures. Furthermore, we found that using lower-resolution images could significantly reduce the time required to generate new facial expressions without compromising that objectively measured quality.
Published
06/11/2023
TESTA, Rafael Luiz; MACHADO-LIMA, Ariane; NUNES, Fátima L. S. Does the Input Image Spatial Resolution Generate Different Synthetic Images? A Comparative Study of Facial Expression Synthesis Performance. In: CONFERENCE ON GRAPHICS, PATTERNS AND IMAGES (SIBGRAPI), 36., 2023, Rio Grande/RS. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2023. p. 175-180.