Comparing Gradient Boosting Algorithms to Forecast Sales in Retail

  • Ana Clara Chaves Sousa Universidade Federal da Paraíba (UFPB)
  • Thaís Gaudêncio do Rêgo Universidade Federal da Paraíba (UFPB)
  • Yuri de Almeida Malheiros Barbosa Universidade Federal da Paraíba (UFPB)
  • Telmo de Menezes e Silva Filho University of Bristol


The availability of data and the increased processing power of computers have made it easier to make decisions based on data, especially with Artificial Intelligence (AI). One area where AI is widely applicable in companies is Supply Chain Management, particularly demand forecasting. This paper forecasts sales for a company in the Cosmetic, Fragrance, and Toiletry market, using data from 2019 to 2023 from two different sales channels. To predict demand, three Gradient Boosting algorithms (CatBoost, LightGBM, and XGBoost) were compared, and forecasts were made for three time horizons (the next period, five periods ahead, and ten periods ahead). In the experiments, LightGBM showed more stability than the other models.
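As a rough illustration of forecasting at several horizons, the sketch below builds supervised training pairs from a sales series for horizons of 1, 5, and 10 periods. It assumes a "direct" setup (one dataset per horizon, built from lagged values); the abstract does not state which multi-step strategy or features the authors used, and the function name `make_direct_samples`, the lag count, and the toy data are all hypothetical.

```python
from typing import List, Tuple


def make_direct_samples(
    series: List[float], n_lags: int, horizon: int
) -> Tuple[List[List[float]], List[float]]:
    """Pair each window of n_lags past values with the value `horizon` periods ahead."""
    features, targets = [], []
    # t is the index of the most recent observation in each window
    for t in range(n_lags - 1, len(series) - horizon):
        features.append(series[t - n_lags + 1 : t + 1])
        targets.append(series[t + horizon])
    return features, targets


# Toy series standing in for sales per period (assumption, not the paper's data)
sales = [float(v) for v in range(1, 21)]

# One dataset per horizon; each could then be fed to any gradient boosting regressor
samples = {h: make_direct_samples(sales, n_lags=3, horizon=h) for h in (1, 5, 10)}
```

Longer horizons leave fewer usable training pairs from the same series, which is one reason accuracy at different horizons is usually reported separately.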

Keywords: Machine Learning, Forecasting, Gradient Boosting


SOUSA, Ana Clara Chaves; DO RÊGO, Thaís Gaudêncio; BARBOSA, Yuri de Almeida Malheiros; MENEZES E SILVA FILHO, Telmo de. Comparing Gradient Boosting Algorithms to Forecast Sales in Retail. In: ENCONTRO NACIONAL DE INTELIGÊNCIA ARTIFICIAL E COMPUTACIONAL (ENIAC), 20., 2023, Belo Horizonte/MG. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2023. p. 596-609. ISSN 2763-9061. DOI: