UniTED: A Unified Time Series Event Detection Repository

  • Janio Lima, Centro Federal de Educação Tecnológica Celso Suckow da Fonseca (CEFET/RJ) / Petróleo Brasileiro S.A. (Petrobras)
  • Hélio Castro, Centro Federal de Educação Tecnológica Celso Suckow da Fonseca (CEFET/RJ)
  • Luiz Oliveira, Centro Federal de Educação Tecnológica Celso Suckow da Fonseca (CEFET/RJ) / Petróleo Brasileiro S.A. (Petrobras)
  • Ellen Paixão, Centro Federal de Educação Tecnológica Celso Suckow da Fonseca (CEFET/RJ)
  • Lais Baroni, Centro Federal de Educação Tecnológica Celso Suckow da Fonseca (CEFET/RJ)
  • Rebecca Salles, INRIA
  • Ricardo Vargas, Petróleo Brasileiro S.A. (Petrobras)
  • Eduardo Ogasawara, Centro Federal de Educação Tecnológica Celso Suckow da Fonseca (CEFET/RJ)

Abstract


Event detection in time series is essential for numerous real-world applications, from monitoring industrial systems to identifying health anomalies. Public annotated datasets are crucial for benchmarking, training, and validating detection models. However, existing datasets have limitations, including poor standardization, a lack of annotation guidelines, limited support for different event types, and difficulties in automating performance evaluation. Despite recent advances in the field, no standardized, unified repository exists for evaluating different event types, which limits reproducibility, comparability, and model development. This paper presents UniTED, a Unified Event Detection Dataset for time series. UniTED consolidates annotated series from diverse domains and offers a common format and evaluation protocol. The repository supports three event types: anomalies, change points, and motifs. It provides a harmonized extract-transform-load (ETL) process, label and annotation conventions, and an open-source implementation. By fostering reusability and reproducibility, UniTED contributes to improved performance assessment and model generalization across data analysis tasks. Three use cases demonstrate the applicability of the dataset.

Keywords: Time Series Event Detection, Anomaly Detection, Change Point Detection, Motif Discovery

Published
29 September 2025
LIMA, Janio; CASTRO, Hélio; OLIVEIRA, Luiz; PAIXÃO, Ellen; BARONI, Lais; SALLES, Rebecca; VARGAS, Ricardo; OGASAWARA, Eduardo. UniTED: A Unified Time Series Event Detection Repository. In: BRAZILIAN E-SCIENCE WORKSHOP (BRESCI), 19., 2025, Fortaleza/CE. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 1-8. ISSN 2763-8774. DOI: https://doi.org/10.5753/bresci.2025.247972.