A 3D Q-Learning Algorithm for Offline UAV Path Planning with Priority Shifting Rewards

  • Kevin Braathen de Carvalho UFV
  • Hiago B. Batista UFV
  • Iure L. de Oliveira UFV
  • Alexandre S. Brandão UFV


Autonomous robot navigation is a field of great importance because of its wide range of applications, such as exploration, transportation, industry, and defense. In these scenarios, Unmanned Aerial Vehicles (UAVs) enable approaches that can increase a task's efficiency and/or flexibility. In this paper we propose an offline path planning method for static 3D environments using Q-Learning. The reward is shaped to account for three priorities, namely path length, energy consumption, and safety, which the user can tune freely to suit the desired application. The proposed algorithm can guide the agent towards the goal from anywhere in the map, which is helpful in scenarios where internal or external instabilities may cause the agent to stray from its main path. Scalability tests were also performed to benchmark the proposed method's performance on larger maps.
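
To make the idea concrete, the following is a minimal sketch of tabular Q-Learning on a 3D occupancy grid with a three-term weighted reward. It is not the paper's implementation: the six-move action set, the reward magnitudes, the neighbour-count safety term, the climb-based energy term, and all names (`ACTIONS`, `step_reward`, `train`, `greedy_path`, `w_len`, `w_energy`, `w_safe`) are assumptions chosen for illustration only.

```python
import numpy as np

# Six axis-aligned moves (dx, dy, dz); an assumed action set.
ACTIONS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def inside(grid, cell):
    return all(0 <= cell[i] < grid.shape[i] for i in range(3))


def move(cell, action):
    return tuple(c + d for c, d in zip(cell, action))


def step_reward(grid, nxt, action, goal, w_len, w_energy, w_safe):
    """Weighted reward; every term and magnitude here is an assumption."""
    if nxt == goal:
        return 100.0
    length_cost = 1.0                             # each move lengthens the path
    energy_cost = 2.0 if action[2] > 0 else 0.0   # climbing assumed costlier
    # Safety term: number of occupied cells adjacent to the next cell.
    near = sum(1 for d in ACTIONS
               if inside(grid, move(nxt, d)) and grid[move(nxt, d)] != 0)
    return -(w_len * length_cost + w_energy * energy_cost + w_safe * near)


def train(grid, goal, weights=(1.0, 1.0, 1.0), episodes=2000,
          alpha=0.5, gamma=0.95, epsilon=0.2, max_steps=200, seed=0):
    rng = np.random.default_rng(seed)
    q = np.zeros(grid.shape + (len(ACTIONS),))
    free = [tuple(int(v) for v in c) for c in np.argwhere(grid == 0)]
    for _ in range(episodes):
        # Random free start cell, so values are learned across the whole map.
        s = free[int(rng.integers(len(free)))]
        for _ in range(max_steps):
            if s == goal:
                break
            if rng.random() < epsilon:
                a = int(rng.integers(len(ACTIONS)))   # explore
            else:
                a = int(np.argmax(q[s]))              # exploit
            nxt = move(s, ACTIONS[a])
            if not inside(grid, nxt) or grid[nxt] != 0:
                r, nxt = -10.0, s                     # collision: penalise, stay put
            else:
                r = step_reward(grid, nxt, ACTIONS[a], goal, *weights)
            # Standard Q-Learning update; the goal is treated as terminal.
            target = r if nxt == goal else r + gamma * np.max(q[nxt])
            q[s + (a,)] += alpha * (target - q[s + (a,)])
            s = nxt
    return q


def greedy_path(q, grid, start, goal, max_steps=200):
    """Follow the highest-valued action from start until the goal is reached."""
    path, s = [start], start
    while s != goal and len(path) <= max_steps:
        nxt = move(s, ACTIONS[int(np.argmax(q[s]))])
        if not inside(grid, nxt) or grid[nxt] != 0:
            break
        s = nxt
        path.append(s)
    return path
```

In this sketch, changing the `weights` tuple shifts the trade-off between the three priorities, and because episodes start from random free cells, `greedy_path` can be queried from any free cell of the trained table rather than from a single fixed start.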
Keywords: Three-dimensional displays, Q-learning, Service robots, Navigation, Transportation industry, Scalability, Autonomous aerial vehicles, Mobile Robotics, Reinforcement Learning, Path Planning, Unmanned Aerial Vehicles
How to Cite

CARVALHO, Kevin Braathen de; BATISTA, Hiago B.; OLIVEIRA, Iure L. de; BRANDÃO, Alexandre S. A 3D Q-Learning Algorithm for Offline UAV Path Planning with Priority Shifting Rewards. In: SIMPÓSIO BRASILEIRO DE ROBÓTICA E SIMPÓSIO LATINO AMERICANO DE ROBÓTICA (SBR/LARS), 19., 2022, São Bernardo do Campo/SP. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2022. p. 169-174.