Strategic Adjustments to Prioritized Experience Replay for Control Challenges: Study with DQN on CartPole
Abstract
This paper explores modifications to the Prioritized Experience Replay (PER) technique proposed by Schaul et al. (2015), applied to the Deep Q-Network (DQN) algorithm by Mnih et al. (2015). The CartPole control task was chosen for implementation, with the goal of improving sample efficiency and maximizing the agent’s reward. New approaches were developed that introduce different strategies for prioritizing samples. The proposed versions are compared with the original PER technique, configured with the same parameters to ensure a fair and accurate analysis. Experimental results showed that some of the proposed versions yield a performance increase of over 25%, leading to significantly higher rewards.
Keywords:
Prioritized Experience Replay, Deep Q-Network
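For context, the baseline the abstract compares against is the proportional prioritization scheme of Schaul et al. (2015), where a transition's sampling priority is derived from its TD-error and importance-sampling weights correct the resulting bias. The sketch below is a minimal illustrative implementation of that baseline (class and method names are hypothetical; the paper's modified strategies are not shown here):

```python
# Minimal sketch of proportional PER (Schaul et al., 2015).
# Names (SimplePERBuffer, etc.) are illustrative, not the authors' code.
import random

class SimplePERBuffer:
    def __init__(self, capacity, alpha=0.6, eps=1e-6):
        self.capacity = capacity
        self.alpha = alpha        # how strongly TD-error shapes the priority
        self.eps = eps            # keeps zero-error transitions sampleable
        self.data = []
        self.priorities = []

    def add(self, transition, td_error):
        # Priority p_i = (|delta_i| + eps)^alpha
        priority = (abs(td_error) + self.eps) ** self.alpha
        if len(self.data) >= self.capacity:
            self.data.pop(0)
            self.priorities.pop(0)
        self.data.append(transition)
        self.priorities.append(priority)

    def sample(self, batch_size, beta=0.4):
        # Sample indices in proportion to priority: P(i) = p_i / sum_k p_k
        total = sum(self.priorities)
        probs = [p / total for p in self.priorities]
        idxs = random.choices(range(len(self.data)), weights=probs, k=batch_size)
        # Importance-sampling weights w_i = (N * P(i))^(-beta), normalized by max
        n = len(self.data)
        weights = [(n * probs[i]) ** (-beta) for i in idxs]
        max_w = max(weights)
        weights = [w / max_w for w in weights]
        return [self.data[i] for i in idxs], idxs, weights

    def update_priorities(self, idxs, td_errors):
        # After a learning step, refresh priorities with the new TD-errors
        for i, d in zip(idxs, td_errors):
            self.priorities[i] = (abs(d) + self.eps) ** self.alpha
```

In practice the original paper uses a sum-tree for O(log N) sampling; the list-based version above trades that efficiency for clarity.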
References
Hessel, M., Modayil, J., Van Hasselt, H., Schaul, T., Ostrovski, G., Dabney, W., Horgan, D., Piot, B., Azar, M., and Silver, D. (2018). Rainbow: Combining improvements in deep reinforcement learning. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 32.
Lin, L.-J. (1992). Self-improving reactive agents based on reinforcement learning, planning and teaching. Machine Learning, 8:293–321.
Mnih, V., Kavukcuoglu, K., Silver, D., Graves, A., Antonoglou, I., Wierstra, D., and Riedmiller, M. (2013). Playing Atari with deep reinforcement learning. arXiv preprint arXiv:1312.5602.
Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A. A., Veness, J., Bellemare, M. G., Graves, A., Riedmiller, M., Fidjeland, A. K., Ostrovski, G., et al. (2015). Human-level control through deep reinforcement learning. Nature, 518(7540):529–533.
Schaul, T., Quan, J., Antonoglou, I., and Silver, D. (2015). Prioritized experience replay. arXiv preprint arXiv:1511.05952.
Sutton, R. S. and Barto, A. G. (2018). Reinforcement Learning: An Introduction. MIT Press.
Published
2024-11-17
How to Cite
MENEZES, Bruno F.; RAMOS, Kaio M.; BARRETO, Gabriel G. S.; BOTELHO, Nícolas G.; BRAGA, Arthur P. de S.
Strategic Adjustments to Prioritized Experience Replay for Control Challenges: Study with DQN on CartPole. In: NATIONAL MEETING ON ARTIFICIAL AND COMPUTATIONAL INTELLIGENCE (ENIAC), 21. , 2024, Belém/PA.
Anais [...].
Porto Alegre: Sociedade Brasileira de Computação,
2024
.
p. 340-351.
ISSN 2763-9061.
DOI: https://doi.org/10.5753/eniac.2024.245100.
