Push Recovery Strategies through Deep Reinforcement Learning

  • Dicksiano Melo ITA
  • Marcos Máximo ITA
  • Adilson da Cunha ITA


This work implements a push recovery controller that augments the walking engine of a simulated humanoid agent from the RoboCup 3D Soccer Simulation League. The result is a movement policy that allows the agent to re-establish its balance when disturbed. Using the Proximal Policy Optimization algorithm, we obtained an expert policy, represented by a deep neural network, that is compatible with a Zero Moment Point walking engine. In contrast with other works, the reward signal used is much more agnostic: the learning agent was not aware of the physical constraints of the problem, only that it should avoid falling in order to accumulate more reward. The proposed method is also sample efficient, learning a human-inspired behavior in just a few hours.
Keywords: Robots, Hip, Humanoid robots, Legged locomotion, Robot sensing systems, Engines, Torque
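The "agnostic" reward described in the abstract can be read as a pure survival bonus: the agent earns reward for every control step it stays upright, and the episode ends when it falls. The sketch below is an illustrative toy under that assumption, not the authors' code; the function names and the episode horizon are hypothetical.

```python
# Illustrative sketch of a physics-agnostic survival reward (assumption:
# +1 per step upright, episode terminated on a fall). Not the authors' code.

def survival_reward(fallen: bool) -> float:
    """Reward the agent only for staying upright; no physics-based terms."""
    return 0.0 if fallen else 1.0

def episode_return(fall_step: int, horizon: int = 200) -> float:
    """Accumulated reward of an episode that falls at `fall_step`
    (hypothetical horizon of 200 control steps)."""
    total = 0.0
    for t in range(horizon):
        fallen = t >= fall_step
        if fallen:
            break  # episode ends as soon as the robot falls
        total += survival_reward(fallen)
    return total
```

Under this signal, longer episodes strictly dominate shorter ones, so maximizing return is equivalent to learning to keep balance, without exposing any physical quantities to the learner.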
How to Cite

MELO, Dicksiano; MÁXIMO, Marcos; DA CUNHA, Adilson. Push Recovery Strategies through Deep Reinforcement Learning. In: SIMPÓSIO BRASILEIRO DE ROBÓTICA E SIMPÓSIO LATINO AMERICANO DE ROBÓTICA (SBR/LARS), 17. , 2020, Natal. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2020 . p. 240-245.