Push Recovery Strategies through Deep Reinforcement Learning

  • Dicksiano Melo ITA
  • Marcos Máximo ITA
  • Adilson da Cunha ITA


This work implements a push recovery controller that augments the walking engine of a simulated humanoid agent from the RoboCup 3D Soccer Simulation League. The result is a movement policy that allows the agent to re-establish its balance when disturbed. Using the Proximal Policy Optimization algorithm, we obtained an expert policy, represented by a deep neural network, that is compatible with a Zero Moment Point walking engine. In contrast with other works, the reward signal used is much more agnostic: the learning agent was not aware of the physical constraints of the problem, only that it should avoid falling in order to accumulate more reward. The proposed method is also sample efficient, learning a human-inspired behavior in just a few hours.
Keywords: Robots, Hip, Humanoid robots, Legged locomotion, Robot sensing systems, Engines, Torque
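The "agnostic" reward described in the abstract can be read as a pure survival bonus: the agent earns reward for every control step it stays upright, and the episode ends when it falls. The sketch below is an illustrative toy under that assumption, not the authors' code; the function names and the episode horizon are hypothetical.

```python
# Illustrative sketch of a physics-agnostic survival reward (assumption:
# +1 per step upright, episode terminated on a fall). Not the authors' code.

def survival_reward(fallen: bool) -> float:
    """Reward the agent only for staying upright; no physics-based terms."""
    return 0.0 if fallen else 1.0

def episode_return(fall_step: int, horizon: int = 200) -> float:
    """Accumulated reward of an episode that falls at `fall_step`
    (hypothetical horizon of 200 control steps)."""
    total = 0.0
    for t in range(horizon):
        fallen = t >= fall_step
        if fallen:
            break  # episode ends as soon as the robot falls
        total += survival_reward(fallen)
    return total
```

Under this signal, longer episodes strictly dominate shorter ones, so maximizing return is equivalent to learning to keep balance, without exposing any physical quantities to the learner.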
How to Cite

MELO, Dicksiano; MÁXIMO, Marcos; DA CUNHA, Adilson. Push Recovery Strategies through Deep Reinforcement Learning. In: SIMPÓSIO BRASILEIRO DE ROBÓTICA E SIMPÓSIO LATINO AMERICANO DE ROBÓTICA (SBR/LARS), 17. , 2020, Natal. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2020 . p. 240-245.