BovDB: A data set of stock quotes for Machine Learning on all companies from B3 between 1995 and 2020

Abstract


Stock markets move huge amounts of financial resources around the world and generate a high volume of transaction data that, once analyzed, is useful for many applications. In this paper we present BovDB, a data set built from the Brazilian Stock Exchange (B3) that covers the years 1995 to 2020. We account for the impact of events on the stocks by applying a cumulative factor to correct prices. A comparison of the results with public data from InfoMoney and BR Investing shows that our methods are valid and consistent with market standards. The BovDB data set can be used as a benchmark for different applications and is publicly available to any researcher on GitHub.
Keywords: Time series, Stocks, B3, Data set
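
The abstract does not spell out how the cumulative factor is computed, so the following is only a minimal sketch of one common convention (backward adjustment), not the authors' implementation. It assumes each event carries a multiplier that rescales every price before its effective day, for example 0.5 for a 2-for-1 split. The names `Event` and `adjust_prices` are hypothetical and do not come from BovDB.

```python
from dataclasses import dataclass


@dataclass
class Event:
    day: int       # index of the first trading day on which the event takes effect
    factor: float  # assumed multiplier for all earlier prices (0.5 for a 2-for-1 split)


def adjust_prices(prices, events):
    """Return prices corrected by a backward cumulative factor (illustrative only)."""
    # Combine events that share an effective day into a single multiplier.
    factor_by_day = {}
    for event in events:
        factor_by_day[event.day] = factor_by_day.get(event.day, 1.0) * event.factor

    adjusted = [0.0] * len(prices)
    cumulative = 1.0
    # Walk from the most recent day to the oldest, accumulating factors on the way.
    for day in range(len(prices) - 1, -1, -1):
        adjusted[day] = prices[day] * cumulative
        # An event effective on this day rescales every earlier day.
        cumulative *= factor_by_day.get(day, 1.0)
    return adjusted


# Made-up quotes: a 2-for-1 split takes effect on day 2.
raw = [100.0, 102.0, 51.0, 52.0]
adjusted = adjust_prices(raw, [Event(day=2, factor=0.5)])
print(adjusted)  # [50.0, 51.0, 51.0, 52.0]
```

With this convention the most recent prices stay unchanged and older prices are rescaled, so the series remains comparable across the event date.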

Published
4 October 2021
How to Cite

CARDOSO, Fabian Corrêa; MALSKA, Juan; RAMIRO, Paulo; LUCCA, Giancarlo; BORGES, Eduardo N.; MATTOS, Viviane de; BERRI, Rafael. BovDB: A data set of stock quotes for Machine Learning on all companies from B3 between 1995 and 2020. In: DATASET SHOWCASE WORKSHOP (DSW), 3., 2021, Rio de Janeiro. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2021. p. 21-32. DOI: https://doi.org/10.5753/dsw.2021.17411.