Forecasting future corn and soybean prices: an analysis of the use of textual information to enrich time-series.

  • Ivan José dos Reis Filho (Universidade do Estado de Minas Gerais; Universidade de São Paulo)
  • Guilherme Bittencourt Correa (Universidade do Estado de Minas Gerais)
  • Guilherme Mendonça Freire (Universidade de São Paulo)
  • Solange Oliveira Rezende (Universidade de São Paulo)


Corn and soybeans are commodities consumed on a large scale worldwide. Fluctuations in their market prices have far-reaching effects on consumers, farmers, and grain processors, so forecasting the prices of these grains has attracted significant attention from researchers. Forecasting models generally use quantitative time-series data. However, external qualitative factors, such as political events, economic crises, and the foreign exchange market, can influence the values in a time-series. This information is not explicit in the time-series data, yet it can affect the prediction of the variable's values. Textual data extracted from news, forums, and social networks can be a source of knowledge about these external factors and is potentially useful for time-series forecasting models. Some studies present text mining techniques to combine textual data with time-series. However, the existing representations have limitations, such as the curse of dimensionality and ineffective attributes. This work applies pre-processing methods to time-series and uses representations combined with textual data to predict the future prices of corn and soybeans. The results indicate that the methods used can be an alternative for improving forecasting performance in regression tasks.
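
As a rough illustration of what combining textual data with a time-series can look like, the sketch below builds one feature row per target day by concatenating lagged prices with word counts from the previous day's text. This is only a minimal sketch under stated assumptions: the abstract does not specify the representations or pre-processing the authors used, so the bag-of-words encoding, the function names (`lagged_windows`, `bag_of_words`, `enrich`), the vocabulary, and the toy data are all hypothetical.

```python
from collections import Counter


def lagged_windows(prices, n_lags):
    """Return (rows, targets): each row holds the n_lags prices before the target."""
    rows, targets = [], []
    for t in range(n_lags, len(prices)):
        rows.append(list(prices[t - n_lags:t]))
        targets.append(prices[t])
    return rows, targets


def bag_of_words(texts, vocabulary):
    """Count how often each vocabulary term occurs in each day's text."""
    vectors = []
    for text in texts:
        counts = Counter(text.lower().split())
        vectors.append([counts[term] for term in vocabulary])
    return vectors


def enrich(prices, daily_texts, vocabulary, n_lags):
    """Concatenate lagged prices with text features from the day before the target."""
    lag_rows, targets = lagged_windows(prices, n_lags)
    text_rows = bag_of_words(daily_texts, vocabulary)
    # The row for target day t uses text from day t-1, so no future text leaks in.
    features = [
        lag_rows[i] + text_rows[i + n_lags - 1] for i in range(len(lag_rows))
    ]
    return features, targets


# Toy, made-up data for illustration only (not from the paper).
prices = [10.0, 10.5, 10.2, 10.8, 11.0]
daily_texts = [
    "drought hits harvest",
    "exports rise",
    "drought eases",
    "exports rise exports",
    "calm market",
]
vocabulary = ["drought", "exports"]

X, y = enrich(prices, daily_texts, vocabulary, n_lags=2)
for row, target in zip(X, y):
    print(row, "->", target)
```

The resulting rows could be passed to any regression model. A plain bag-of-words encoding like this one grows with the vocabulary size, which is the kind of dimensionality problem the abstract attributes to existing representations.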

Keywords: Time-series, Text mining, Forecasting, Agricultural commodities


DOS REIS FILHO, Ivan José; CORREA, Guilherme Bittencourt; FREIRE, Guilherme Mendonça; REZENDE, Solange Oliveira. Forecasting future corn and soybean prices: an analysis of the use of textual information to enrich time-series. In: SYMPOSIUM ON KNOWLEDGE DISCOVERY, MINING AND LEARNING (KDMILE), 8., 2020, Online Event. Proceedings [...]. Porto Alegre: Sociedade Brasileira de Computação, 2020. p. 113-120. ISSN 2763-8944. DOI: