Dense Prediction Transformer for Scale Estimation in Monocular Visual Odometry

André O. Françani; Marcos R. O. A. Maximo

André O. Françani ITA
Marcos R. O. A. Maximo ITA

Abstract

Monocular visual odometry estimates the position of an agent from images captured by a single camera, and it is applied in autonomous vehicles, medical robots, and augmented reality. However, monocular systems suffer from the scale ambiguity problem because 2D frames carry no depth information. This paper applies the dense prediction transformer model to scale estimation in monocular visual odometry systems. Experimental results show that the scale drift problem of monocular systems can be reduced through this model's accurate estimation of the depth map, achieving performance competitive with the state of the art on a visual odometry benchmark.
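The abstract does not spell out how the predicted depth map is turned into a scale factor, so the following is only a minimal sketch of one common approach, not necessarily the authors' procedure. It assumes that a monocular pipeline has triangulated some points at an arbitrary scale and that a depth network provides metric depths at the same pixels; the scale is then taken as the median ratio between the two. The function names `estimate_scale` and `rescale_translation` and the sample values are hypothetical.

```python
from statistics import median


def estimate_scale(predicted_depths, triangulated_depths, min_depth=1e-6):
    """Median ratio of predicted (metric) depth to triangulated (up-to-scale) depth.

    Assumption: both sequences refer to the same points, in the same order.
    Pairs with a non-positive or near-zero depth are skipped.
    """
    ratios = [
        d_pred / d_tri
        for d_pred, d_tri in zip(predicted_depths, triangulated_depths)
        if d_pred > min_depth and d_tri > min_depth
    ]
    if not ratios:
        raise ValueError("no valid depth pairs to estimate scale from")
    # The median limits the influence of a few badly triangulated points.
    return median(ratios)


def rescale_translation(unit_translation, scale):
    """Apply the estimated scale to an up-to-scale translation vector."""
    return [scale * t for t in unit_translation]


# Hypothetical sample: three consistent points and one outlier.
predicted = [10.0, 20.0, 5.0, 40.0]
triangulated = [1.0, 2.0, 0.5, 1.0]

scale = estimate_scale(predicted, triangulated)
translation = rescale_translation([0.0, 0.0, 1.0], scale)
```

With these sample values the ratios are 10, 10, 10, and 40, so the median ignores the outlier and the unit translation is stretched to 10 units along the camera's forward axis.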

Keywords: Visualization, Robot vision systems, Estimation, Benchmark testing, Predictive models, Transformers, Cameras, monocular visual odometry, scale estimation, deep learning, monocular depth estimation, vision transformer

IEEE Xplore (English)

Published

18 October 2022

How to Cite

FRANÇANI, André O.; MAXIMO, Marcos R. O. A. Dense Prediction Transformer for Scale Estimation in Monocular Visual Odometry. In: SIMPÓSIO BRASILEIRO DE ROBÓTICA E SIMPÓSIO LATINO AMERICANO DE ROBÓTICA (SBR/LARS), 19., 2022, São Bernardo do Campo/SP. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2022. p. 312-317.