Computational Modeling and Artificial Intelligence for Infectious Disease Outbreak Risk Classification in Maceió, Alagoas, Brazil

  • Samila Raphaela de Oliveira (IFAL)
  • Victor Luan de Lima Lemos (IFAL)
  • Tarsis Marinho de Souza (IFAL)
  • Cledja Karina Rolim da Silva (IFAL)

Abstract


Infectious diseases remain a major public health challenge and require effective strategies for surveillance, prevention, and control. In Brazil, despite the availability of epidemiological surveillance systems such as SINAN, the occurrence of outbreaks still reveals limitations in anticipating critical events. In this context, the early identification of risk patterns is essential to support timely public health action. This study analyzes and compares machine learning models for classifying outbreak risk using SINAN data from Maceió, Alagoas, with emphasis on five diseases: Hepatitis, Meningitis, Pertussis, Varicella, and Exanthematic Diseases. The proposed approach models risk as three levels (Normal, Attention, Outbreak) to support more interpretable and operationally useful surveillance outcomes.
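As a rough illustration of what a three-level risk target could look like, the sketch below labels weekly case counts as Normal, Attention, or Outbreak. The abstract does not state how the levels are derived, so the thresholds used here (baseline mean plus one and two standard deviations), the function name `label_risk`, and the case counts are all assumptions made for this example, not the study's method or SINAN data.

```python
from statistics import mean, stdev


def label_risk(counts, baseline, attention_k=1.0, outbreak_k=2.0):
    """Assign one of three risk levels to each weekly case count.

    ASSUMPTION: thresholds are baseline mean + k * sample standard deviation.
    This rule is illustrative only; it is not taken from the paper.
    """
    mu = mean(baseline)
    sigma = stdev(baseline)  # sample standard deviation of the baseline weeks
    attention_cut = mu + attention_k * sigma
    outbreak_cut = mu + outbreak_k * sigma

    labels = []
    for count in counts:
        if count > outbreak_cut:
            labels.append("Outbreak")
        elif count > attention_cut:
            labels.append("Attention")
        else:
            labels.append("Normal")
    return labels


# Synthetic weekly counts, invented for this example (not SINAN data).
baseline_weeks = [2, 3, 1, 4, 2, 3, 2, 3]
new_weeks = [3, 4, 9]

labels = label_risk(new_weeks, baseline_weeks)
for week, (count, level) in enumerate(zip(new_weeks, labels), start=1):
    print(f"week {week}: {count} cases -> {level}")
```

Labels produced this way could serve as the target for the classifiers being compared; in practice the cut-offs would need to be set per disease, since baseline incidence differs widely among the five conditions.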

Published
19/07/2026
OLIVEIRA, Samila Raphaela de; LEMOS, Victor Luan de Lima; SOUZA, Tarsis Marinho de; SILVA, Cledja Karina Rolim da. Computational Modeling and Artificial Intelligence for Infectious Disease Outbreak Risk Classification in Maceió, Alagoas, Brazil. In: ENCONTRO NACIONAL DE COMPUTAÇÃO DOS INSTITUTOS FEDERAIS (ENCOMPIF), 13., 2026, Gramado/RS. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2026. p. 33-40. ISSN 2763-8766. DOI: https://doi.org/10.5753/encompif.2026.20219.