CBNAV: Costmap Based Approach to Deep Reinforcement Learning Mobile Robot Navigation

Darci Luiz Tomasi; Eduardo Todt

Darci Luiz Tomasi UFPR
Eduardo Todt UFPR

Abstract

This research presents a novel deep reinforcement learning approach to mobile robot navigation, in which the reward function is based on cockroach sensor adaptation and on the temporal and spatial navigation skills of the insect brain. Initially the environment is unknown, and the agent is trained to construct a costmap that is used to reward its actions according to the most promising regions of the environment to occupy. In each training epoch the robot's path is also mapped as a costmap, which buffers the steps performed and penalizes those that violate predefined navigation rules. Training and testing are performed in simulation, across five different environments, with an agent equipped with a LiDAR sensor that provides 1081 distance measurements per step. The evaluation criteria are cross-validation between the trained models and the average number of times the target was reached. Experimental results show that introducing the costmap concepts, the Costmap Reward Function (CRF) and the Costmap Path Function (CPF), improves the generalization of the model, characterizing a statistically promising approach.

Keywords: Training, Laser radar, Navigation, Statistical analysis, Insects, Reinforcement learning, Robot sensing systems, Deep Reinforcement Learning, Indoor Mobile Robot, Reward Function, Costmap, Double Q-learning, Dueling DDQN