Reinforcement Learning and Loss of Plasticity Phenomenon in Coverage Path Planning Environments: An Exploratory Study

  • João Lucas Cadorniga (INSPER)
  • Pedro Pertusi (INSPER)
  • Fabrício J. Barth (INSPER)

Abstract

This paper investigates the phenomenon of loss of plasticity in the context of Reinforcement Learning (RL) applied to Coverage Path Planning (CPP) environments, particularly with the use of curriculum learning. Unlike shortest-path problems, CPP focuses on systematically covering an area. Recent advances in deep RL often face challenges with continual learning, where the network’s ability to adapt diminishes with successive new tasks. We developed a reproducible Gym environment for single-agent CPP with varying difficulty levels and evaluated three RL models: Deep Q-Network (DQN), Proximal Policy Optimization (PPO) without L2 regularization, and PPO with L2 regularization. Our findings indicate that loss of plasticity is present in this CPP task, especially with the DQN model, as agents struggled to generalize and retain knowledge across different map layouts. Curriculum learning, in this context, did not improve the agent’s performance and sometimes appeared to hinder learning new levels. While PPO models showed slight improvements in learning later levels compared to DQN, none of the tested models fully mitigated the loss of plasticity.
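As a rough sketch of the setup described above, the code below implements a minimal single-agent coverage grid world with the Gymnasium API and trains a PPO agent whose L2 regularization is added through the Adam optimizer's weight decay in Stable-Baselines3. The class name CoverageEnv, the size and n_obstacles difficulty knobs, the reward shaping, and the weight_decay value are illustrative assumptions, not the authors' implementation.

import numpy as np
import gymnasium as gym
from gymnasium import spaces
import torch
from stable_baselines3 import PPO

class CoverageEnv(gym.Env):
    """Minimal single-agent coverage grid world (illustrative sketch only)."""

    def __init__(self, size=8, n_obstacles=4):
        super().__init__()
        self.size = size
        self.n_obstacles = n_obstacles
        # Observation: flattened stack of three grids (obstacles, visited cells, agent position).
        self.observation_space = spaces.Box(0.0, 1.0, shape=(3 * size * size,), dtype=np.float32)
        self.action_space = spaces.Discrete(4)  # up, down, left, right

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        cells = [(r, c) for r in range(self.size) for c in range(self.size)]
        picks = self.np_random.choice(len(cells), size=self.n_obstacles + 1, replace=False)
        self.obstacles = np.zeros((self.size, self.size), dtype=np.float32)
        for i in picks[:-1]:
            self.obstacles[cells[i]] = 1.0
        self.agent = cells[picks[-1]]
        self.visited = np.zeros_like(self.obstacles)
        self.visited[self.agent] = 1.0
        self.steps = 0
        return self._obs(), {}

    def step(self, action):
        dr, dc = [(-1, 0), (1, 0), (0, -1), (0, 1)][int(action)]
        r, c = self.agent[0] + dr, self.agent[1] + dc
        reward = -0.1  # step cost discourages wandering over already covered cells
        if 0 <= r < self.size and 0 <= c < self.size and self.obstacles[r, c] == 0:
            self.agent = (r, c)
            if self.visited[r, c] == 0:
                self.visited[r, c] = 1.0
                reward += 1.0  # bonus for covering a new cell
        self.steps += 1
        terminated = bool(self.visited.sum() == self.size ** 2 - self.obstacles.sum())
        truncated = self.steps >= 4 * self.size ** 2  # cap episode length
        return self._obs(), reward, terminated, truncated, {}

    def _obs(self):
        pos = np.zeros_like(self.obstacles)
        pos[self.agent] = 1.0
        return np.concatenate([g.ravel() for g in (self.obstacles, self.visited, pos)])

# PPO with an L2 penalty supplied via Adam's weight_decay (assumed value, not the paper's).
env = CoverageEnv(size=8, n_obstacles=4)
model = PPO(
    "MlpPolicy",
    env,
    policy_kwargs=dict(optimizer_class=torch.optim.Adam,
                       optimizer_kwargs=dict(weight_decay=1e-4)),
    verbose=0,
)
model.learn(total_timesteps=50_000)

Dropping the policy_kwargs argument gives an unregularized PPO baseline, and substituting DQN for PPO against the same environment yields a third configuration, so the agents can be compared across increasingly difficult layouts in the spirit of the study.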

Published

2025-09-29

CADORNIGA, João Lucas; PERTUSI, Pedro; BARTH, Fabrício J. Reinforcement Learning and Loss of Plasticity Phenomenon in Coverage Path Planning Environments: An Exploratory Study. In: NATIONAL MEETING ON ARTIFICIAL AND COMPUTATIONAL INTELLIGENCE (ENIAC), 22., 2025, Fortaleza/CE. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 1962-1971. ISSN 2763-9061. DOI: https://doi.org/10.5753/eniac.2025.14347.