Residual Reinforcement Learning to Generate a Closed-Loop Policy from a Keyframe Kick Motion

  • Marcos Vinícius S. Passamani ITA
  • Marcos R. O. A. Maximo ITA
  • Luckeciano C. Melo University of Oxford / AKCIT

Abstract

RoboCup is an international robot football competition that aims to advance robotics and artificial intelligence to the point where a team of robotic players can beat a human World Cup champion team. One of its major categories is the 3D Soccer Simulation League, in which a full football match is played in a physically simulated environment. In this league, keyframe interpolation is commonly used to execute specific tasks such as getting up, diving, or kicking the ball. A primary problem with this method is that it describes the movement kinematically and in open loop: it uses no feedback from the world, which makes the behavior prone to underperformance or failure in complex and noisy environments. This work contributes a methodology for enhancing keyframe movements with residual learning: we train a policy with the Proximal Policy Optimization (PPO) algorithm to improve a keyframe kick motion. The approach improved the reliability of the kick as well as its performance metrics, including distance and precision.
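The contrast between the open-loop keyframe command and its residual correction can be illustrated with a minimal sketch. It assumes linear interpolation between keyframes and an additive, bounded correction, which is the usual form of residual learning; the abstract does not specify either detail. The function names, the two-joint kick, the scaling factor, and the joint limits are all hypothetical, and the residual vector stands in for the output of a PPO-trained policy.

```python
import numpy as np

def keyframe_targets(times, poses, t):
    """Open-loop joint targets: linear interpolation between keyframes at time t."""
    times = np.asarray(times, dtype=float)
    poses = np.asarray(poses, dtype=float)
    # np.interp handles one joint at a time and holds the end poses outside the range.
    return np.array([np.interp(t, times, poses[:, j]) for j in range(poses.shape[1])])

def residual_targets(base, residual, scale=0.1, lower=-1.0, upper=1.0):
    """Closed-loop targets: keyframe command plus a bounded correction (assumed form)."""
    correction = scale * np.clip(residual, -1.0, 1.0)
    return np.clip(base + correction, lower, upper)

# Hypothetical two-joint kick with three keyframes (angles in radians).
times = [0.0, 0.2, 0.4]
poses = [[0.0, 0.0], [-0.5, 0.8], [0.6, 0.0]]

base = keyframe_targets(times, poses, t=0.1)
# Stand-in for the policy output; a trained network would compute it from observations.
residual = np.array([0.5, -2.0])
command = residual_targets(base, residual)
```

With a zero residual the command reduces to the original keyframe motion, so in this sketch the policy only has to learn small corrections around a behavior that already works.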
Keywords: Interpolation, Three-dimensional displays, Humanoid robots, Reinforcement learning, Kinematics, Performance metrics, Reliability, Noise measurement, Optimization, Sports, Residual Reinforcement Learning, RoboCup 3D Simulation, Keyframe Optimization, Humanoid Robotics, Proximal Policy Optimization (PPO)
Published
13/10/2025
PASSAMANI, Marcos Vinícius S.; MAXIMO, Marcos R. O. A.; MELO, Luckeciano C. Residual Reinforcement Learning to Generate a Closed-Loop Policy from a Keyframe Kick Motion. In: SIMPÓSIO BRASILEIRO DE ROBÓTICA E SIMPÓSIO LATINO AMERICANO DE ROBÓTICA (SBR/LARS), 17., 2025, Vitória/ES. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 147-151.