Controlling Tiltrotor Unmanned Aerial Vehicles (UAVs) with Deep Reinforcement Learning

Aline Gabriel De Almeida UNESP
Esther Luna Colombini UNICAMP
Alexandre Da Silva Simões UNESP

Abstract

Unmanned Aerial Vehicles (UAVs) have attracted significant attention across many domains because of their versatility and range of potential applications. Effective control is crucial for achieving desired flight behaviors and optimizing performance. This paper explores learning-based approaches for controlling fixed-rotor and tiltrotor UAVs, focusing on two reinforcement learning algorithms: Proximal Policy Optimization (PPO) and Twin-Delayed Deep Deterministic Policy Gradient (TD3). The study compares and evaluates how well these two state-of-the-art algorithms control UAVs of varying design and control complexity. With PPO and TD3, we address the challenges of maneuvering UAVs in dynamic environments and achieving precise control under different flight conditions. We conducted extensive simulations to assess both algorithms across diverse UAV scenarios, covering multiple design configurations and control requirements. The evaluation criteria were stability, robustness, trajectory tracking accuracy, and control efficiency. The results demonstrate that both PPO and TD3 are suitable and effective for controlling UAVs.

Keywords: Reinforcement Learning, Unmanned Aerial Vehicle (UAV), Proximal Policy Optimization (PPO), Twin-Delayed Deep Deterministic Policy Gradient (TD3), Tiltrotor