Aprendizado por Reforço Profundo com Redes Recorrentes Aplicado à Negociação do Minicontrato Futuro de Dólar

  • Jonathan Kenji Kinoshita, Centro Universitário FEI
  • Douglas De Rizzo Meneghetti, Centro Universitário FEI
  • Reinaldo Augusto da Costa Bianchi, Centro Universitário FEI

Abstract


Recently, there has been a considerable increase in the use of machine learning techniques in financial markets, mainly for stock trading, in an attempt to predict future prices. The goal of this project is to investigate the application of reinforcement learning to an intelligent trading system for the US dollar futures mini contract, using a Deep Recurrent Q-Network, a technique based on training a recurrent network to solve partially observable reinforcement learning problems. Training used a historical dataset of the asset, and the agent chose among three actions: buy, sell, or hold, always aiming to maximize financial return. The experiments indicate that the system can outperform both the Buy and Hold strategy and the traditional DQN.

Keywords: Deep Reinforcement Learning, Convolutional Neural Networks, Recurrent Neural Networks, Long Short-Term Memory, Deep Recurrent Q-Network, Futures Market
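
As a rough illustration of the approach described in the abstract, the sketch below shows a minimal LSTM-based Q-network that maps a window of market observations to one Q-value per action (buy, sell, hold). It is written in PyTorch; the layer sizes, window length, feature count, and action indexing are illustrative assumptions and are not taken from the paper.

```python
# Minimal sketch of an LSTM-based Q-network for a three-action trading agent
# (buy, sell, hold). Hyperparameters here are illustrative assumptions.
import torch
import torch.nn as nn


class DRQN(nn.Module):
    def __init__(self, n_features: int = 5, hidden_size: int = 64, n_actions: int = 3):
        super().__init__()
        # The LSTM summarizes the recent price window, giving the agent memory
        # over the partially observable market state.
        self.lstm = nn.LSTM(input_size=n_features, hidden_size=hidden_size, batch_first=True)
        # A linear head maps the last hidden state to one Q-value per action.
        self.head = nn.Linear(hidden_size, n_actions)

    def forward(self, obs_window, hidden=None):
        # obs_window: (batch, time, n_features) sequence of market observations.
        out, hidden = self.lstm(obs_window, hidden)
        q_values = self.head(out[:, -1, :])  # Q(s, a) for buy / sell / hold
        return q_values, hidden


if __name__ == "__main__":
    net = DRQN()
    window = torch.randn(1, 30, 5)       # one 30-step window of 5 features
    q, _ = net(window)
    action = int(q.argmax(dim=1))        # 0 = buy, 1 = sell, 2 = hold (illustrative mapping)
    print(q, action)
```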

Published
31/07/2022
How to Cite

KINOSHITA, Jonathan Kenji; MENEGHETTI, Douglas De Rizzo; BIANCHI, Reinaldo Augusto da Costa. Aprendizado por Reforço Profundo com Redes Recorrentes Aplicado à Negociação do Minicontrato Futuro de Dólar. In: BRAZILIAN WORKSHOP ON ARTIFICIAL INTELLIGENCE IN FINANCE (BWAIF), 1. , 2022, Niterói. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2022 . p. 13-24. DOI: https://doi.org/10.5753/bwaif.2022.222808.