Action Exploration in Portfolio Optimization with Reinforcement Learning

Caio de Souza Barbosa Costa (USP); Anna Helena Reali Costa (USP)

DOI: https://doi.org/10.5753/eniac.2024.245250

Abstract

In portfolio optimization, an agent continuously rebalances the assets of a financial portfolio to maximize its long-term value. With advancements in artificial intelligence, several machine learning methods have been employed to develop agents capable of effectively managing portfolios. Among these, reinforcement learning agents have achieved significant success, particularly after the introduction of a specialized policy gradient algorithm that is currently the state-of-the-art training algorithm in the field. However, the full-exploitation characteristic of the algorithm hinders the agent’s exploration ability, an essential aspect of reinforcement learning, and can produce sub-optimal strategies that may even reduce the final portfolio value. To overcome this challenge, this paper explores the integration of noise functions to improve exploration in the agent’s action space. Three distinct noise formulations adapted to the portfolio optimization task are evaluated through experiments in the Brazilian market. The results indicate that these noise-driven exploration strategies effectively mitigate the risk of sub-optimal policy generation and significantly improve overall portfolio performance.

Keywords: Portfolio Optimization, Reinforcement Learning, Action Exploration, Quantitative Finance
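
The abstract does not specify the three noise formulations, so the following is only a minimal sketch of what noise-driven exploration in a portfolio action space could look like. It assumes the action is a vector of non-negative weights summing to 1, and shows two illustrative perturbations that keep the action valid: Gaussian noise added in log-weight space followed by a softmax, and a convex mix with a random Dirichlet portfolio. The function names and the parameters `sigma`, `epsilon`, and `alpha` are hypothetical and are not taken from the paper.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax: maps any real vector to portfolio weights."""
    z = x - np.max(x)
    e = np.exp(z)
    return e / e.sum()


def perturb_logits(weights, sigma, rng):
    """Add Gaussian noise in log-weight space, then renormalize with softmax.

    Assumption: `weights` is non-negative and sums to 1. With sigma = 0 the
    original weights are recovered.
    """
    logits = np.log(np.clip(weights, 1e-12, None))
    noisy = logits + rng.normal(0.0, sigma, size=weights.shape)
    return softmax(noisy)


def mix_with_dirichlet(weights, epsilon, rng, alpha=1.0):
    """Convex mix of the policy's weights with a random Dirichlet portfolio.

    A convex combination of two points on the simplex stays on the simplex,
    so no renormalization is needed. With epsilon = 0 the action is unchanged.
    """
    noise = rng.dirichlet(alpha * np.ones_like(weights))
    return (1.0 - epsilon) * weights + epsilon * noise


rng = np.random.default_rng(0)
action = np.array([0.7, 0.2, 0.1])  # hypothetical weights output by a policy

noisy_a = perturb_logits(action, sigma=0.3, rng=rng)
noisy_b = mix_with_dirichlet(action, epsilon=0.1, rng=rng)
print(noisy_a, noisy_a.sum())
print(noisy_b, noisy_b.sum())
```

Both perturbations return a valid weight vector, which matters here because naively adding noise to the weights themselves could produce negative allocations or weights that no longer sum to 1.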

