A Model for Optimizing Biped Robot Walking Using an Inverted Pendulum and Reinforcement Learning
Abstract
This work develops an inverted pendulum prototype combined with reinforcement learning, together with a complete training environment built on the BahiaRT-GYM platform. In this environment, the agent's trunk inclination during walking serves as a practical case study, tuned to achieve stable and fluid locomotion. This case demonstrates the environment's capability to support and optimize practical training. The results show that the reinforcement learning-trained model achieves a 26% performance increase and a 27.8% speedup over the inverted pendulum approach. Both outperform the original cart-table-based model.
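For readers unfamiliar with the inverted pendulum approach mentioned above, the following minimal sketch evaluates the standard linear inverted pendulum model, in which the center of mass (CoM) at constant height z_c obeys ẍ = (g / z_c)(x − p) for a support point p. It is an illustration under stated assumptions, not the implementation used in this work: the function name `lipm_state`, its parameters, and the example CoM height of 0.25 m are hypothetical.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2


def lipm_state(x0, v0, p, z_c, t):
    """Closed-form CoM position and velocity of a linear inverted pendulum.

    x0, v0: initial CoM position (m) and velocity (m/s)
    p:      support point position (m), held fixed during the step
    z_c:    constant CoM height (m); an assumed value, not from the paper
    t:      elapsed time (s)
    """
    t_c = math.sqrt(z_c / G)  # time constant of the pendulum
    c, s = math.cosh(t / t_c), math.sinh(t / t_c)
    x = p + (x0 - p) * c + t_c * v0 * s
    v = (x0 - p) / t_c * s + v0 * c
    return x, v


if __name__ == "__main__":
    # A CoM that starts 2 cm ahead of the support point drifts away from it.
    for t in (0.0, 0.1, 0.2, 0.3):
        x, v = lipm_state(0.02, 0.0, 0.0, 0.25, t)
        print(f"t={t:.1f} s  x={x:.4f} m  v={v:.4f} m/s")
```

The model is unstable around the support point, which is why a walking controller must keep choosing new support points (or, as studied here, adjust quantities such as trunk inclination) to keep the CoM trajectory bounded.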
