POE: A General Portfolio Optimization Environment for FinRL

Caio de Souza Barbosa Costa; Anna Helena Reali Costa

doi:10.5753/bwaif.2023.231144

Caio de Souza Barbosa Costa USP
Anna Helena Reali Costa USP

Abstract

Portfolio optimization is a common task in financial markets in which a manager periodically rebalances the assets in a portfolio, aiming to make a profit, minimize losses, and maximize long-term returns. Because of their adaptability, Reinforcement Learning (RL) techniques are considered well suited to this task; however, despite RL’s strong results, simulation environments for it lack standardization. In this paper, we present an RL environment for the portfolio optimization problem based on state-of-the-art mathematical formulations. The environment aims to be easy to use, highly customizable, and integrated with modern RL frameworks.

Keywords: Portfolio optimization, Reinforcement learning, Simulation environment, Quantitative finance
