Action Exploration in Portfolio Optimization with Reinforcement Learning
Abstract
In portfolio optimization, an agent continuously rebalances the assets of a financial portfolio to maximize its long-term value. With advancements in artificial intelligence, several machine learning methods have been employed to develop agents capable of effectively managing portfolios. Among these, reinforcement learning agents have achieved significant success, particularly after the introduction of a specialized policy gradient algorithm that is currently the state-of-the-art training algorithm of the research field. However, the full-exploitation characteristic of the algorithm hinders the agent’s exploration ability – an essential aspect of reinforcement learning – resulting in the generation of sub-optimal strategies that may even reduce the final portfolio value. To overcome this challenge, this paper explores the integration of noise functions to improve exploration in the agent’s action space. Three distinct noise formulations adapted to the portfolio optimization task are evaluated through experiments in the Brazilian market. The results indicate that these noise-driven exploration strategies effectively mitigate the risk of sub-optimal policy generation and significantly improve overall portfolio performance.
Keywords:
Portfolio Optimization, Reinforcement Learning, Action Exploration, Quantitative Finance
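The abstract describes adding noise to the agent's action, which in this setting is a vector of portfolio weights. It does not state the three noise formulations, so the following is only a minimal sketch of one possible approach, assuming Gaussian noise applied to the log-weights followed by softmax renormalization. The function name `perturb_weights` and the parameter `sigma` are hypothetical and do not come from the paper.

```python
import math
import random


def perturb_weights(weights, sigma=0.1, rng=None):
    """Return a noisy copy of a portfolio weight vector that still sums to 1.

    Hypothetical helper (not the paper's formulation): Gaussian noise is
    added to the log-weights and the result is renormalized with a softmax,
    so every weight stays positive and the allocation remains valid.
    """
    rng = rng or random.Random()
    eps = 1e-12  # avoids log(0) for assets with zero allocation
    logits = [math.log(w + eps) + rng.gauss(0.0, sigma) for w in weights]
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


action = [0.5, 0.3, 0.2]  # weights proposed by the policy (illustrative values)
noisy = perturb_weights(action, sigma=0.2, rng=random.Random(0))
unchanged = perturb_weights(action, sigma=0.0)
```

With `sigma=0.0` the function returns the original allocation up to rounding, so the amount of exploration can be controlled through that single parameter.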
References
Cartea, Á., Jaimungal, S., and Penalva, J. (2015). Algorithmic and High-Frequency Trading. Cambridge University Press.
Costa, C. d. S. B. and Costa, A. H. R. (2023). POE: A General Portfolio Optimization Environment for FinRL. In Anais Do Brazilian Workshop on Artificial Intelligence in Finance (BWAIF), pages 132–143. SBC.
Felizardo, L. K., Paiva, F. C. L., Costa, A. H. R., and Del-Moral-Hernandez, E. (2022). Reinforcement Learning Applied to Trading Systems: A Survey.
Gelman, A., Carlin, J. B., Stern, H. S., Dunson, D. B., Vehtari, A., and Rubin, D. B. (2015). Bayesian Data Analysis. Chapman and Hall/CRC, New York, 3rd edition.
Gunjan, A. and Bhattacharyya, S. (2022). A brief review of portfolio optimization techniques. Artificial Intelligence Review.
Hinton, G., Vinyals, O., and Dean, J. (2015). Distilling the Knowledge in a Neural Network.
Jiang, Z., Xu, D., and Liang, J. (2017). A Deep Reinforcement Learning Framework for the Financial Portfolio Management Problem.
Liang, Z., Chen, H., Zhu, J., Jiang, K., and Li, Y. (2018). Adversarial Deep Reinforcement Learning in Portfolio Management.
Lillicrap, T. P., Hunt, J. J., Pritzel, A., Heess, N., Erez, T., Tassa, Y., Silver, D., and Wierstra, D. (2015). Continuous control with deep reinforcement learning.
Markowitz, H. (1952). Portfolio Selection. The Journal of Finance, 7(1):77–91.
Ross, S. (2014). Introduction to Probability Models. In Introduction to Probability Models (Eleventh Edition), page iii. Academic Press, Boston.
Schulman, J., Wolski, F., Dhariwal, P., Radford, A., and Klimov, O. (2017). Proximal Policy Optimization Algorithms.
Shi, S., Li, J., Li, G., and Pan, P. (2019). A Multi-Scale Temporal Feature Aggregation Convolutional Neural Network for Portfolio Management. In Proceedings of the 28th ACM International Conference on Information and Knowledge Management, pages 1613–1622, Beijing, China. ACM.
Shi, S., Li, J., Li, G., Pan, P., Chen, Q., and Sun, Q. (2022). GPM: A graph convolutional network based reinforcement learning framework for portfolio management. Neurocomputing, 498:14–27.
Soleymani, F. and Paquet, E. (2021). Deep graph convolutional reinforcement learning for financial portfolio management – DeepPocket. Expert Systems with Applications, 182:115127.
Sutton, R. S. and Barto, A. G. (2018). Reinforcement Learning: An Introduction. A Bradford Book, Cambridge, MA, USA.
Xu, K., Zhang, Y., Ye, D., Zhao, P., and Tan, M. (2020). Relation-Aware Transformer for Portfolio Policy Learning. In Twenty-Ninth International Joint Conference on Artificial Intelligence, volume 5, pages 4647–4653.
Published
17/11/2024
How to Cite
COSTA, Caio de Souza Barbosa; COSTA, Anna Helena Reali. Action Exploration in Portfolio Optimization with Reinforcement Learning. In: ENCONTRO NACIONAL DE INTELIGÊNCIA ARTIFICIAL E COMPUTACIONAL (ENIAC), 21., 2024, Belém/PA. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2024. p. 316-327. ISSN 2763-9061. DOI: https://doi.org/10.5753/eniac.2024.245250.