Multi-Scale Patch Partitioning for Image Inpainting Based on Visual Transformers
Abstract: Image inpainting is a challenging task that aims to reconstruct missing pixels with semantically coherent content and realistic texture using the available information. Modern inpainting methods rely on neural networks to generate realistic images. However, because convolution operators have a limited receptive field, these methods may produce distorted content when a large region needs to be filled. Recent methods have employed transformers to address this problem, but their high computational cost makes it difficult to work with global image information. To address this, we propose a multi-scale patch partitioning strategy that subdivides feature maps into non-overlapping patches, and a transformer with a variable number of heads that controls how the computational cost grows with the number of patches. Smaller patches enable broader image coverage, helping to recover structural information, whereas larger patches reduce the computational cost. In contrast to the fixed, small patch sizes employed by other methods in the literature, we explore different patch sizes in the transformer blocks to achieve a good balance between the computational cost and the number of pixel references used in the reconstruction. Extensive experiments on three datasets show that our method achieves very competitive results compared to the state of the art, reaching the best scores in various scenarios, especially for metrics based on human perception. Moreover, our model has the smallest size among the compared methods. Our qualitative results suggest that the proposed method is able to reconstruct structural content such as parts of human faces.
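The trade-off between patch size and cost can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `partition` and `attention_pairs`, the 8 x 8 single-channel feature map, and the use of the squared token count as a stand-in for attention cost are all assumptions made for illustration, and the variable number of heads is not modeled.

```python
def partition(feature_map, patch):
    """Split an H x W grid (list of rows) into non-overlapping patch x patch tiles.

    Returns the tiles in row-major order, each flattened into a list of values.
    """
    height, width = len(feature_map), len(feature_map[0])
    if height % patch or width % patch:
        raise ValueError("patch size must divide both height and width")
    tiles = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            tiles.append([feature_map[r][c]
                          for r in range(top, top + patch)
                          for c in range(left, left + patch)])
    return tiles


def attention_pairs(height, width, patch):
    """Token pairs scored by full self-attention: (number of tokens) squared."""
    tokens = (height // patch) * (width // patch)
    return tokens * tokens


# Hypothetical 8 x 8 single-channel feature map holding the values 0..63.
fmap = [[r * 8 + c for c in range(8)] for r in range(8)]

for p in (2, 4, 8):
    tiles = partition(fmap, p)
    # patch size, number of tokens, values per token, attention pairs
    print(p, len(tiles), len(tiles[0]), attention_pairs(8, 8, p))
```

Under these assumptions, halving the patch side quadruples the number of tokens and multiplies the number of attention pairs by sixteen, which is the kind of cost growth that motivates using different patch sizes in different blocks.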
Keywords: Measurement, Visualization, Convolution, Neural networks, Computer architecture, Transformers, Computational efficiency, Image inpainting, Visual transformers, Multi-scale patch partitioning
CAMPANA, Jose Luis Flores; DECKER, Luís Gustavo Lorgus; ROBERTO E SOUZA, Marcos; MAIA, Helena de Almeida; PEDRINI, Helio. Multi-Scale Patch Partitioning for Image Inpainting Based on Visual Transformers. In: CONFERENCE ON GRAPHICS, PATTERNS AND IMAGES (SIBGRAPI), 35., 2022, Natal/RN. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2022.