TVAnet: a spatial and feature-based attention model for self-driving car

Victor Flores-Benites; Carlos A. Mugruza-Vassallo; Rensso Mora-Colque

Victor Flores-Benites, Universidad Catolica San Pablo
Carlos A. Mugruza-Vassallo, Universidad Nacional Tecnologica de Lima Sur
Rensso Mora-Colque, Universidad Catolica San Pablo

Abstract

End-to-end methods facilitate the development of self-driving models by employing a single network that learns the human driving style from examples. However, these models suffer from distributional shift, causal confusion, and high variance. To address these problems, we propose two techniques. First, we propose the priority sampling algorithm, which biases training sampling towards observations that are unfamiliar to the model. Priority sampling employs a trade-off strategy that incentivizes the training algorithm to explore the whole dataset. Our results show uniform training across the dataset, as well as improved performance. Second, we propose a model based on the theory of visual attention, called TVAnet, which selects relevant visual information to build an optimal environment representation. TVAnet employs two visual information selection mechanisms: spatial and feature-based attention. Spatial attention selects regions whose visual encoding is similar to the contextual encoding, while feature-based attention selects disentangled features that carry useful information for routine driving. Furthermore, we encourage the model to recognize new sources of visual information by adding a bottom-up input. Results on the CoRL-2017 dataset show that our spatial attention mechanism recognizes regions relevant to the driving task and that TVAnet builds disentangled features with low mutual dependence. Furthermore, our model is interpretable, which facilitates understanding of the intelligent vehicle's behavior. Finally, we report performance improvements over traditional end-to-end models.

Keywords: Training, Visualization, Intelligent vehicles, Face recognition, Encoding, Autonomous automobiles, Task analysis, Visual attention, Self-driving, Spatial attention, Feature-based attention
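
As an illustration of the priority sampling idea summarized in the abstract, the following minimal sketch biases batch selection towards samples the model handles poorly while keeping a uniform term so that the whole dataset is still explored. It is not the algorithm of this paper: the use of the most recent per-sample loss as the priority signal, the mixing weight `explore`, and the function names `priority_probabilities` and `sample_batch` are all assumptions introduced here for illustration.

```python
import random


def priority_probabilities(losses, explore=0.2):
    """Mix loss-proportional priorities with a uniform distribution.

    ASSUMPTION: `losses` holds the latest per-sample training loss and is
    used as a proxy for how unfamiliar each sample is to the model.
    ASSUMPTION: `explore` in [0, 1] is a hypothetical trade-off weight;
    0 gives purely loss-driven sampling, 1 gives uniform sampling.
    """
    n = len(losses)
    if n == 0:
        return []
    total = sum(losses)
    if total <= 0:
        # No loss information yet: fall back to uniform sampling.
        return [1.0 / n] * n
    return [(1 - explore) * (loss / total) + explore / n for loss in losses]


def sample_batch(losses, batch_size, explore=0.2, rng=random):
    """Draw `batch_size` dataset indices (with replacement) by priority."""
    probs = priority_probabilities(losses, explore)
    return rng.choices(range(len(losses)), weights=probs, k=batch_size)
```

In a training loop under these assumptions, the entries of `losses` for the sampled indices would be refreshed after each step, so that samples the model has already learned gradually lose priority.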