Benchmarking Nonstationary Time Series Prediction
Abstract
Time series prediction has gained increasing attention among researchers, since it is a crucial aspect of decision-making activities. Unfortunately, most time series prediction methods assume stationarity, i.e., that the statistical properties of the series do not change over time. In practice, stationarity is the exception rather than the rule in most real datasets. Several transformation methods have been designed to treat nonstationarity in time series. In this context, nonstationary time series prediction is challenging, since it demands knowledge of both data transformation and prediction methods. Because there is no silver bullet, building prediction setups requires exploring a large number of combinations of data transformation and prediction methods. However, selecting a prediction setup that is appropriate to a particular time series and application is not a simple task, and benchmarking the candidate combinations supports this selection. This work contributes a review and experimental analysis of transformation methods and a systematic framework (TSPred) for benchmarking and selecting prediction setups for nonstationary time series. Suitable nonstationary time series transformation methods improved prediction accuracy by more than 30% for half of the evaluated time series and by more than 95% for 10% of them. The features provided by TSPred are also shown to be competitive regarding prediction accuracy. Furthermore, the adoption of a validation phase during model training enables the selection of suitable transformation methods.
Keywords:
benchmark, time series, prediction
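The idea of selecting a transformation through a validation phase can be illustrated with a minimal sketch. This is not the TSPred API (TSPred is an R package); every function name, the AR(1) predictor, the synthetic trending series, and the split sizes below are assumptions chosen only for illustration. Two candidate setups (no transformation versus first differencing) are compared on a held-out validation tail, and the one with the lower validation error is then scored on the test segment.

```python
import random

def fit_ar1(x):
    """Least-squares fit of x[t] = a + b * x[t-1]; returns (a, b)."""
    prev, curr = x[:-1], x[1:]
    n = len(prev)
    mp, mc = sum(prev) / n, sum(curr) / n
    var = sum((p - mp) ** 2 for p in prev)
    b = sum((p - mp) * (c - mc) for p, c in zip(prev, curr)) / var if var > 0 else 0.0
    return mc - b * mp, b

def forecast_identity(train, horizon):
    """Recursive multi-step AR(1) forecasts on the untransformed series."""
    a, b = fit_ar1(train)
    out, last = [], train[-1]
    for _ in range(horizon):
        last = a + b * last
        out.append(last)
    return out

def forecast_differenced(train, horizon):
    """Difference the series, forecast the differences, then integrate back."""
    diffs = [y - x for x, y in zip(train, train[1:])]
    a, b = fit_ar1(diffs)
    out, level, d = [], train[-1], diffs[-1]
    for _ in range(horizon):
        d = a + b * d          # next predicted difference
        level += d             # reverse the transformation
        out.append(level)
    return out

def mse(actual, predicted):
    """Mean squared error between two equally long sequences."""
    return sum((x - y) ** 2 for x, y in zip(actual, predicted)) / len(actual)

def select_setup(series, n_val, setups):
    """Return the setup with the lowest error on the last n_val points."""
    train, val = series[:-n_val], series[-n_val:]
    scores = {name: mse(val, f(train, n_val)) for name, f in setups.items()}
    return min(scores, key=scores.get), scores

# Synthetic nonstationary series (assumed): linear trend plus Gaussian noise.
random.seed(0)
series = [0.5 * t + random.gauss(0, 1) for t in range(120)]
train_val, test = series[:100], series[100:]

setups = {"identity": forecast_identity, "differencing": forecast_differenced}
best, val_scores = select_setup(train_val, 20, setups)

# Refit the selected setup on all training data and score it on the test set.
test_error = mse(test, setups[best](train_val, len(test)))
print(best, val_scores, test_error)
```

The test segment plays no part in the selection, so the reported test error remains an out-of-sample estimate for the chosen setup.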
References
Adalberto Andrade, Rebecca Salles, Flavio Carvalho, Eduardo Bezerra da Silva, Jorge Soares, Cristina Souza, Pedro Henrique Gonzalez, and Eduardo Ogasawara. Uso de ciência de dados para predição do consumo de fertilizantes no Brasil. In Anais do XIV Brazilian e-Science Workshop, pages 9–16. SBC, 2020.
C. Cheng, A. Sa-Ngasoongsong, O. Beyca, T. Le, H. Yang, Z. Kong, and S.T.S. Bukkapatnam. Time series forecasting for nonlinear and non-stationary processes: A review and comparative study. IIE Transactions (Institute of Industrial Engineers), 47(10):1053–1071, 2015.
Damodar Gujarati. Basic Econometrics. McGraw-Hill/Irwin, Boston; Montreal, 4th edition, March 2002. ISBN 978-0-07-247852-5.
Jiawei Han, Micheline Kamber, and Jian Pei. Data Mining: Concepts and Techniques. Morgan Kaufmann, Haryana, India; Burlington, MA, 3rd edition, July 2011. ISBN 978-93-80931-91-3.
Balthazar Paixão, Lais Baroni, Marcel Pedroso, Rebecca Salles, Luciana Escobar, Carlos de Sousa, Raphael de Freitas Saldanha, Jorge Soares, Rafaelli Coutinho, Fabio Porto, et al. Estimation of COVID-19 under-reporting in the Brazilian states through SARI. New Generation Computing, pages 1–23, 2021.
Arthur Ronald, Rebecca Salles, Kele Belloze, Dayse Pastore, and Eduardo Ogasawara. Modelo autorregressivo de integração adaptativa. In Anais do XXXIV Simpósio Brasileiro de Banco de Dados, pages 175–180. SBC, 2019.
Rebecca Salles, Patricia Mattos, Ana-Maria Dubois Iorgulescu, Eduardo Bezerra, Leonardo Lima, and Eduardo Ogasawara. Evaluating temporal aggregation for predicting the sea surface temperature of the Atlantic Ocean. Ecological Informatics, 36:94–105, November 2016. ISSN 1574-9541. doi: 10.1016/j.ecoinf.2016.10.004.
Rebecca Salles, Laura Assis, Gustavo Paiva Guedes, Eduardo Bezerra, Fabio Porto, and Eduardo S. Ogasawara. A framework for benchmarking machine learning methods using linear models for univariate time series prediction. In 2017 International Joint Conference on Neural Networks, IJCNN 2017, Anchorage, AK, USA, May 14-19, 2017, pages 2338–2345, 2017. doi: 10.1109/IJCNN.2017.7966139.
Rebecca Salles, Kele Belloze, Fabio Porto, Pedro H. Gonzalez, and Eduardo Ogasawara. Nonstationary time series transformation methods: An experimental review. Knowledge-Based Systems, 164:274–291, January 2019. ISSN 0950-7051. doi: 10.1016/j.knosys.2018.10.041.
Rebecca Salles, Luciana Escobar, Lais Baroni, Roccio Zorrilla, Artur Ziviani, Vinicius Kreischer, Flavia Delicato, Paulo F Pires, Luciano Maia, Rafaelli Coutinho, et al. Harbinger: Um framework para integração e análise de métodos de detecção de eventos em séries temporais. In Anais do XXXV Simpósio Brasileiro de Bancos de Dados, pages 73–84. SBC, 2020.
Rebecca Pontes Salles and Eduardo Ogasawara. TSPred: Functions for Benchmarking Time Series Prediction. Technical report, https://CRAN.R-project.org/package=TSPred, 2018.
W. Yang and I. Zurbenko. Nonstationarity. Wiley Interdisciplinary Reviews: Computational Statistics, 2(1):107–115, 2010.
Published
4 October 2021
How to Cite
SALLES, Rebecca Pontes; OGASAWARA, Eduardo; GONZÁLEZ, Pedro. Benchmarking Nonstationary Time Series Prediction. In: CONCURSO DE TESES E DISSERTAÇÕES (CTDBD) - SIMPÓSIO BRASILEIRO DE BANCO DE DADOS (SBBD), 36., 2021, Rio de Janeiro. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2021. p. 177-182. DOI: https://doi.org/10.5753/sbbd_estendido.2021.18182.