Enhancing ARIMA with Residual Kalman Filtering for Robust Time Series Prediction

  • Mouglas Eugênio Nasário Gomes UFPE
  • Bárbara Silva Morais UFPE
  • Paulo Salgado Gomes de-Mattos-Neto UFPE

Abstract


This paper proposes ARIMA+KF Residual, a hybrid model that preserves ARIMA’s transparency while correcting its residuals with a gradient-trained Kalman filter. After automatic order selection via the Box–Jenkins methodology, the ARIMA error is treated as a latent state, and the filtered residual is added to the linear forecast. Evaluated on four benchmark series (Airline, Colorado River, Sunspot, and Lynx) with a rolling one-step procedure, the method reduces the MSE by up to 84% compared with SVR, MLP, LSTM, and ARIMA–ANN/SVM hybrids, delivering high accuracy with minimal tuning effort.
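
The residual-correction idea can be illustrated with a minimal sketch. It is not the authors' implementation, and it makes several assumptions: the latent residual follows a scalar random walk, the noise variances `q` and `r` are fixed by hand instead of being trained by gradient descent as in the paper, and `linear_forecast` is a hypothetical array of one-step ARIMA forecasts produced elsewhere. The function name `kalman_residual_correction` is also illustrative.

```python
import numpy as np


def kalman_residual_correction(y, linear_forecast, q=0.01, r=1.0):
    """Rolling one-step correction of a linear forecast with a scalar Kalman filter.

    Assumption of this sketch: the residual e_t = y_t - linear_forecast_t is a
    noisy observation of a latent random-walk state x_t, with hand-picked
    process variance q and observation variance r.
    """
    y = np.asarray(y, dtype=float)
    linear_forecast = np.asarray(linear_forecast, dtype=float)

    x, p = 0.0, 1.0  # initial state estimate and its variance (assumed)
    corrected = np.empty_like(y)

    for t in range(len(y)):
        # Predict: a random walk keeps the mean and inflates the variance.
        p = p + q

        # The forecast for time t uses only residuals observed up to t-1.
        corrected[t] = linear_forecast[t] + x

        # Update: fold in the residual once y[t] has been observed.
        e = y[t] - linear_forecast[t]
        k = p / (p + r)  # Kalman gain
        x = x + k * (e - x)
        p = (1.0 - k) * p

    return corrected


if __name__ == "__main__":
    # Toy data: the linear forecast misses a constant offset of 5.
    y = np.full(50, 5.0)
    linear_forecast = np.zeros(50)
    corrected = kalman_residual_correction(y, linear_forecast)
    print("MSE linear   :", np.mean((y - linear_forecast) ** 2))
    print("MSE corrected:", np.mean((y - corrected) ** 2))
```

In this toy case the filter gradually absorbs the systematic bias of the linear forecast, so the corrected series has a lower MSE. On real series the gain would depend on how `q` and `r` are chosen or learned.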

Published
2025-09-29
GOMES, Mouglas Eugênio Nasário; MORAIS, Bárbara Silva; DE-MATTOS-NETO, Paulo Salgado Gomes. Enhancing ARIMA with Residual Kalman Filtering for Robust Time Series Prediction. In: NATIONAL MEETING ON ARTIFICIAL AND COMPUTATIONAL INTELLIGENCE (ENIAC), 22., 2025, Fortaleza/CE. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 25-36. ISSN 2763-9061. DOI: https://doi.org/10.5753/eniac.2025.11774.