Zero-Shot Synth-to-Real Depth Estimation: From Synthetic Street Scenes to Real-World Data

  • Luís Gustavo Lorgus Decker UNICAMP
  • Jose Luis Flores Campana UNICAMP
  • Marcos Roberto e Souza UNICAMP
  • Helena de Almeida Maia UNICAMP
  • Helio Pedrini UNICAMP

Abstract


This paper introduces a novel method for estimating depth maps from single images using a convolutional neural network (CNN) architecture. Our approach leverages synthetically generated data that simulates front views from autonomous vehicles. The model combines a ConvNeXt encoder with a U-Net-based decoder, achieving effective depth estimation. A scale- and shift-invariant loss function is used during training to improve generalization. The proposed model achieves strong results on real-world datasets without fine-tuning, demonstrating effective transfer from synthetic to real-world data. Additionally, we show that a smaller amount of high-quality data yields significantly better performance than larger, lower-quality datasets.
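The scale- and shift-invariant loss mentioned above can be sketched as follows. This is a minimal NumPy illustration under our own assumptions (MiDaS-style affine alignment via least squares), not the authors' actual implementation; the function name `ssi_loss` is hypothetical:

```python
import numpy as np

def ssi_loss(pred, gt):
    """Scale- and shift-invariant MSE (illustrative sketch).

    Aligns the prediction to the ground truth with the least-squares
    optimal scale s and shift t, then averages the squared residuals,
    so predictions correct up to an affine transform incur no penalty.
    """
    pred = pred.ravel().astype(float)
    gt = gt.ravel().astype(float)
    # Solve min_{s,t} ||s*pred + t - gt||^2 in closed form.
    A = np.stack([pred, np.ones_like(pred)], axis=1)
    (s, t), *_ = np.linalg.lstsq(A, gt, rcond=None)
    aligned = s * pred + t
    return np.mean((aligned - gt) ** 2)
```

Because the loss first solves for the best scale and shift, a prediction that differs from the ground truth only by an affine transform yields a loss of (numerically) zero, which is what allows training on synthetic depth whose absolute scale need not match real-world sensors.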
Keywords: Training, Graphics, Sensitivity, Filtering, Data integrity, Estimation, Feature extraction, Data models, Convolutional neural networks, Synthetic data
Published
30/09/2024
DECKER, Luís Gustavo Lorgus; CAMPANA, Jose Luis Flores; SOUZA, Marcos Roberto e; MAIA, Helena de Almeida; PEDRINI, Helio. Zero-Shot Synth-to-Real Depth Estimation: From Synthetic Street Scenes to Real-World Data. In: CONFERENCE ON GRAPHICS, PATTERNS AND IMAGES (SIBGRAPI), 37., 2024, Manaus/AM. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2024.