Towards the Integration of Reinforcement Learning into MASPY
Abstract
Learning in symbolic agent architectures remains a key challenge in the development of adaptive multi-agent systems. This paper introduces a learning module for MASPY, a Python-based framework inspired by the Belief-Desire-Intention (BDI) model. The module enables agents to learn which actions to take using tabular reinforcement learning algorithms such as Q-Learning and SARSA. To support this, we propose the SART methodology, which decomposes the learning environment into four structured components: States, Actions, Rewards, and Transitions. This structure allows MASPY agents to perceive their environment through defined percepts, act through decorated functions, and adapt over time using discrete learning strategies. The learning module thus offers a unified Python-based architecture in which symbolic reasoning agents learn through reinforcement learning. We demonstrate it on a toy problem in which agents learn which actions to execute in a previously unknown environment.
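To make the SART decomposition concrete, the following is a minimal sketch in plain Python of tabular Q-Learning and SARSA on a five-cell corridor. It does not use MASPY's API: the environment, the names `transition`, `reward`, `choose`, `train`, and `greedy_policy`, and all hyperparameter values are assumptions chosen for illustration only.

```python
import random
from collections import defaultdict

# Hypothetical SART description of a 1-D corridor (illustrative, not MASPY's API).
STATES = range(5)             # States: cells 0..4, cell 4 is the goal
ACTIONS = ("left", "right")   # Actions: available in every state
GOAL = 4


def transition(state, action):
    """Transitions: move one cell, clamped to the corridor bounds."""
    step = 1 if action == "right" else -1
    return min(max(state + step, 0), GOAL)


def reward(state, action, next_state):
    """Rewards: +1 for reaching the goal, a small step cost otherwise."""
    return 1.0 if next_state == GOAL else -0.01


def choose(q, state, epsilon, rng):
    """Epsilon-greedy selection over the Q-table, ties broken at random."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    best = max(q[(state, a)] for a in ACTIONS)
    return rng.choice([a for a in ACTIONS if q[(state, a)] == best])


def train(algorithm="q_learning", episodes=500, alpha=0.1, gamma=0.9,
          epsilon=0.1, seed=0):
    """Run tabular Q-Learning or SARSA and return the learned Q-table."""
    rng = random.Random(seed)
    q = defaultdict(float)  # Q-values default to 0.0
    for _ in range(episodes):
        state = 0
        action = choose(q, state, epsilon, rng)
        while state != GOAL:
            nxt = transition(state, action)
            r = reward(state, action, nxt)
            # The next action is drawn before the update so that both
            # algorithms share one loop; Q-Learning is off-policy, so this
            # only affects the behaviour policy, not its update target.
            next_action = choose(q, nxt, epsilon, rng)
            if nxt == GOAL:
                target = r  # terminal state: no bootstrapping
            elif algorithm == "sarsa":
                target = r + gamma * q[(nxt, next_action)]  # on-policy
            else:
                target = r + gamma * max(q[(nxt, a)] for a in ACTIONS)
            q[(state, action)] += alpha * (target - q[(state, action)])
            state, action = nxt, next_action
    return q


def greedy_policy(q):
    """Extract the greedy action for every non-terminal state."""
    return {s: max(ACTIONS, key=lambda a: q[(s, a)])
            for s in STATES if s != GOAL}
```

Calling `greedy_policy(train("q_learning"))` returns the learned action for each non-goal cell, and passing `"sarsa"` switches only the update target. Keeping the four SART components as separate definitions means the learning loop never needs to know how the environment is implemented.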
