Predicting the Risk of Death from Yellow Fever at Different Stages of Clinical Follow-up Using Machine Learning
Abstract
Despite the availability of a vaccine, yellow fever remains highly lethal, and machine learning (ML) models for individual risk stratification are still scarce. This study developed ML models to predict the risk of death from Brazilian national surveillance data, organizing the predictor attributes in a temporally coherent way across three stages of clinical follow-up: notification (M1), initial assessment (M2), and late phase (M3). Five tree-based algorithms were compared using nested cross-validation with Bayesian optimization, including probability calibration and threshold tuning under asymmetric cost. On the holdout set, CatBoost performed best, with ROC-AUC/PR-AUC of 0.680/0.226 (M1), 0.764/0.321 (M2), and 0.814/0.434 (M3), showing a progressive gain as more clinical and laboratory data become available. SHAP-based explainability analysis identified the main factors associated with the estimated risk, increasing the model's transparency and its potential for application in health surveillance settings.
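To make the idea of threshold tuning under asymmetric cost concrete, the following minimal sketch picks the decision threshold that minimizes a weighted count of errors on a toy set of calibrated probabilities. It is an illustration only, not the study's pipeline: the function names, the toy data, and the 5:1 cost ratio (a missed death weighted five times a false alarm) are all assumptions introduced here.

```python
from typing import Sequence, Tuple


def expected_cost(y_true: Sequence[int], y_prob: Sequence[float],
                  threshold: float, cost_fn: float = 5.0,
                  cost_fp: float = 1.0) -> float:
    """Total cost of classifying as positive every case with prob >= threshold."""
    # False negatives: true deaths scored below the threshold.
    fn = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p < threshold)
    # False positives: survivors scored at or above the threshold.
    fp = sum(1 for y, p in zip(y_true, y_prob) if y == 0 and p >= threshold)
    return cost_fn * fn + cost_fp * fp


def best_threshold(y_true: Sequence[int], y_prob: Sequence[float],
                   cost_fn: float = 5.0,
                   cost_fp: float = 1.0) -> Tuple[float, float]:
    """Return (threshold, cost) minimizing the asymmetric cost.

    Candidates are the observed probabilities plus 1.01, which flags no one.
    Ties in cost are broken in favor of the lowest threshold.
    """
    candidates = sorted(set(y_prob)) + [1.01]
    scored = [(expected_cost(y_true, y_prob, t, cost_fn, cost_fp), t)
              for t in candidates]
    cost, threshold = min(scored)
    return threshold, cost


# Hypothetical toy data: 1 = death, 0 = survival; probabilities are made up.
y_true = [0, 0, 1, 0, 1, 0, 0, 1, 0, 0]
y_prob = [0.05, 0.10, 0.35, 0.20, 0.70, 0.15, 0.40, 0.25, 0.08, 0.30]

threshold, cost = best_threshold(y_true, y_prob)
print(f"chosen threshold = {threshold}, total cost = {cost}")
```

On this toy data the search settles on a threshold of 0.25, well below the conventional 0.5, because each missed death is penalized more heavily than a false alarm. In practice the threshold would be chosen on validation folds rather than on the data used for the final evaluation.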
