Heat It Up: Using Robo-gym to Warm-up Deep Reinforcement Learning Algorithms

Gabriel Bermudez; Matheus A. Do Carmo Alves; Gabriel D. G. Pedro; Thiago Boaventura

Gabriel Bermudez USP
Matheus A. Do Carmo Alves USP
Gabriel D. G. Pedro USP
Thiago Boaventura USP

Abstract

Training deep reinforcement learning (deep RL) algorithms for robotics often requires large amounts of data, which are difficult and expensive to collect. Although a gap remains between simulated and real environments, simulation enables safe, continuous data collection with minimal human intervention. In this work, we present a consistent and lightweight approach for training deep RL algorithms in a complex robotic context. Our proposal adapts the robo-gym framework to run a hybrid training process: a warm-up phase that uses only data from simulations (OpenAI Gym), followed by training that incorporates the complexity of the robotics models (ROS and Gazebo). We evaluate the approach on the classic inverted pendulum swing-up problem using three state-of-the-art baselines. Overall, performing warm-ups significantly improves the learning process, boosting training quality by up to 26%. Our quickest warm-up takes only 2 minutes and can improve the initial learning point by up to 83%, saving 91% of the training time needed to reach the same reward with a traditional approach.

Keywords: Training, Heating systems, Adaptation models, Operating systems, Data collection, Deep reinforcement learning, Data models, Complexity theory, Proposals, Robots, Simulation, Robot Operating System
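
To make the hybrid schedule summarized in the abstract concrete, the following is a minimal sketch of the two-phase idea: train first in a cheap environment, then continue from the same parameters in a costlier one. It is not the authors' implementation and does not use robo-gym, OpenAI Gym, ROS, or Gazebo. Every name in it (`ToyPendulum`, `train`, `hybrid_training`, the `substeps` fidelity knob) is hypothetical, the pendulum dynamics are a simplified toy, and a plain hill-climbing search over a two-parameter linear policy stands in for a deep RL baseline.

```python
import math
import random


class ToyPendulum:
    """Toy pendulum (theta = 0 is upright). `substeps` is an assumed stand-in
    for simulator fidelity: more substeps means a finer, costlier integration."""

    def __init__(self, substeps=1, dt=0.05, horizon=100):
        self.substeps, self.dt, self.horizon = substeps, dt, horizon

    def rollout(self, params):
        """Run one episode with a linear policy and return the total reward."""
        k_theta, k_omega = params
        theta, omega, total = math.pi, 0.0, 0.0  # start hanging down
        h = self.dt / self.substeps
        for _ in range(self.horizon):
            angle = math.atan2(math.sin(theta), math.cos(theta))  # wrap to [-pi, pi]
            torque = max(-2.0, min(2.0, k_theta * angle + k_omega * omega))
            for _ in range(self.substeps):
                omega += (15.0 * math.sin(theta) + 3.0 * torque) * h
                theta += omega * h
            total -= angle ** 2 + 0.1 * omega ** 2 + 0.001 * torque ** 2
        return total


def train(env, params, iterations, rng, sigma=0.5):
    """Hill climbing (a stand-in for a deep RL learner): keep a random
    perturbation of the policy parameters only if it scores better."""
    best = env.rollout(params)
    for _ in range(iterations):
        candidate = tuple(p + rng.gauss(0.0, sigma) for p in params)
        score = env.rollout(candidate)
        if score > best:
            params, best = candidate, score
    return params, best


def hybrid_training(warmup_iters, target_iters, seed=0):
    """Warm up in the cheap model, then continue in the costly one."""
    rng = random.Random(seed)
    cheap = ToyPendulum(substeps=1)    # plays the role of the lightweight simulation
    costly = ToyPendulum(substeps=20)  # plays the role of the full robotics model
    warm_params, _ = train(cheap, (0.0, 0.0), warmup_iters, rng)
    start_score = costly.rollout(warm_params)  # initial learning point after warm-up
    _, final_score = train(costly, warm_params, target_iters, rng)
    return start_score, final_score
```

Calling `hybrid_training(warmup_iters=0, target_iters=n)` gives a cold-start run to compare against `hybrid_training(warmup_iters=m, target_iters=n)`. In this toy setting, whether the warm-up raises the starting score depends on how closely the cheap model tracks the costly one; the sketch illustrates only the schedule, not the gains reported above.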