BovDB: a data set of stock prices of all companies in B3 from 1995 to 2020

Authors

  • Fabian Corrêa Cardoso, Universidade Federal do Rio Grande
  • Juan Andrey Valverde Malska, Universidade Federal do Rio Grande
  • Paulo Junior Ramiro, Universidade Federal do Rio Grande
  • Giancarlo Lucca, Universidade Federal do Rio Grande
  • Eduardo Nunes Borges, Universidade Federal do Rio Grande
  • Viviane Leite Dias de Mattos, Universidade Federal do Rio Grande
  • Rafael Alceste Berri, Universidade Federal do Rio Grande

DOI:

https://doi.org/10.5753/jidm.2022.2345

Keywords:

B3, Data set, Stocks, Time series

Abstract

Stock markets move vast amounts of financial resources worldwide. They generate a high volume of transaction data which, once analyzed, is useful for many applications. In this article, we present BovDB, a data set built from a Brazilian Stock Exchange (B3) source and covering the years 1995 to 2020. We account for the impact of events on stocks by applying a cumulative factor to correct prices. The results were compared with public data from InfoMoney and BR Investing, showing that our method, based on the proposed factor, is valid and follows market standards. The BovDB data set can be used as a benchmark for different applications and is openly available to any researcher on GitHub.
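As an illustration of what correcting prices with a cumulative factor can look like, the following is a minimal sketch and not the BovDB implementation. The abstract does not define the factor, so the sketch assumes the common backward-adjustment convention: each event carries a multiplier that applies to every quote before its effective date, and the multipliers accumulate as we move back in time. The names `Event` and `adjust_prices` are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Event:
    """Hypothetical event record (assumption, not the BovDB schema)."""
    day: date      # first trading day on which the event is already reflected
    factor: float  # multiplier for all quotes BEFORE `day` (0.5 for a 2-for-1 split)


def adjust_prices(prices, events):
    """Return (date, adjusted_close) pairs in chronological order.

    prices: iterable of (date, close) pairs.
    events: iterable of Event.
    Each quote is multiplied by the product of the factors of all
    events that took effect strictly after that quote's date.
    """
    pending = sorted(events, key=lambda e: e.day)  # oldest first, newest last
    cumulative = 1.0
    adjusted = []
    # Walk from the most recent quote back to the oldest one.
    for day, close in sorted(prices, reverse=True):
        # Absorb every event that took effect after this quote.
        while pending and pending[-1].day > day:
            cumulative *= pending.pop().factor
        adjusted.append((day, close * cumulative))
    adjusted.reverse()
    return adjusted


# Illustrative (made-up) quotes around a 2-for-1 split effective 2020-01-03.
quotes = [
    (date(2020, 1, 1), 100.0),
    (date(2020, 1, 2), 102.0),
    (date(2020, 1, 3), 51.0),
    (date(2020, 1, 6), 52.0),
]
split = [Event(day=date(2020, 1, 3), factor=0.5)]
corrected = adjust_prices(quotes, split)
```

With these made-up quotes, the two prices before the split are halved, so the corrected series no longer shows an artificial 50% drop on the split date. The quotes on and after the effective date are left unchanged.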

Published

2022-08-15

How to Cite

Corrêa Cardoso, F., Valverde Malska, J. A., Ramiro, P. J., Lucca, G., Nunes Borges, E., Leite Dias de Mattos, V., & Alceste Berri, R. (2022). BovDB: a data set of stock prices of all companies in B3 from 1995 to 2020. Journal of Information and Data Management, 13(1). https://doi.org/10.5753/jidm.2022.2345

Section

Dataset Showcase Workshop 2021 - Extended Papers