The impact of window size on univariate time series forecasting using machine learning
Abstract
When a time series forecasting problem is framed as a supervised learning task, the window size (w) is a hyperparameter that defines how many time steps are present in each example fed to the learning model. This hyperparameter is particularly important because the model must capture both long-term and short-term trends, as well as seasonal patterns, without becoming sensitive to random fluctuations. In this article, we aim to understand the effect of the window size on the results of machine learning algorithms in univariate time series forecasting problems. To this end, we used 40 time series from two distinct domains and trained models with varying window sizes using four types of machine learning algorithms: Bagging, Boosting, Stacking, and a Recurrent Neural Network architecture. We observed that increasing the window size can improve the evaluation metric values up to a point of stabilization, beyond which further increases in the window size do not yield better predictions. In both domains studied, this stabilization occurred only when w exceeded 100 time steps. We also observed that Recurrent Neural Network architectures do not outperform ensemble models in several univariate time series forecasting scenarios.
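To make the role of w concrete, the following is a minimal sketch of how a univariate series can be cut into fixed-size windows for supervised learning. It is an illustration only, not the authors' pipeline: the function name `make_windows`, the `horizon` parameter, and the toy series are assumptions introduced here.

```python
def make_windows(series, w, horizon=1):
    """Build (X, y) training pairs from a univariate series.

    Hypothetical helper for illustration: each example holds w consecutive
    observations, and its target is the value `horizon` steps after the
    last observation in the window.
    """
    values = list(series)
    if w < 1 or horizon < 1:
        raise ValueError("w and horizon must be positive integers")
    n_examples = len(values) - w - horizon + 1
    if n_examples <= 0:
        raise ValueError("series is too short for this window size")
    # Each row of X is one window of w consecutive time steps.
    X = [values[i:i + w] for i in range(n_examples)]
    # Each target lies `horizon` steps past the end of its window.
    y = [values[i + w + horizon - 1] for i in range(n_examples)]
    return X, y


# Toy series (assumed data): a larger w gives longer inputs but fewer examples.
toy_series = list(range(10))
X_small, y_small = make_windows(toy_series, w=3)
X_large, y_large = make_windows(toy_series, w=6)
```

With a series of fixed length, a larger w gives the model more context per example but leaves fewer examples to train on, which is one reason the choice of w is a trade-off rather than a case of larger always being better.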