Comparison of Machine Learning Models Applied to Predicting Total Dengue Cases

  • Thiago Carvalho, Pontifícia Universidade Católica do Rio de Janeiro
  • Gabriel Tenório, Pontifícia Universidade Católica do Rio de Janeiro
  • Karla Figueiredo, Universidade do Estado do Rio de Janeiro
  • Marley Vellasco, Pontifícia Universidade Católica do Rio de Janeiro
  • Wouter Caarls, Pontifícia Universidade Católica do Rio de Janeiro

Abstract


Dengue is an endemic disease that occurs mainly in tropical areas, since it is transmitted by mosquitoes. Using preprocessing and machine learning techniques, this work aims to develop a predictive model that captures the relationship between a city's conditions and the spread of dengue epidemics, as part of the 'DengAI - predicting disease spread' competition hosted on the DrivenData platform. Among the models implemented, the ensemble of Random Forest and neural networks achieved the best performance, with a 4.5% improvement over the benchmark.
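
As a rough illustration of the two ideas in the abstract (combining two models' predictions and reporting a relative improvement over a benchmark), the sketch below uses only the Python standard library. It is not the authors' implementation: the equal-weight average, the use of mean absolute error as the score, the helper names, and all the numbers are assumptions chosen for illustration.

```python
from statistics import mean


def ensemble_average(pred_a, pred_b, weight_a=0.5):
    """Blend two prediction lists with a weighted average.

    The equal weighting is a hypothetical choice, not taken from the paper.
    """
    return [weight_a * a + (1.0 - weight_a) * b for a, b in zip(pred_a, pred_b)]


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between observed and predicted counts."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def relative_improvement(benchmark_error, model_error):
    """Fraction by which the model error is lower than the benchmark error."""
    return (benchmark_error - model_error) / benchmark_error


# Made-up weekly case counts and predictions, for illustration only.
actual = [10, 20, 30, 40]
forest_pred = [12, 18, 33, 35]   # stand-in for Random Forest output
network_pred = [8, 24, 29, 43]   # stand-in for neural network output

blended = ensemble_average(forest_pred, network_pred)
forest_mae = mean_absolute_error(actual, forest_pred)
network_mae = mean_absolute_error(actual, network_pred)
blended_mae = mean_absolute_error(actual, blended)
```

In this toy data the two models err in opposite directions, so the averaged prediction has a lower error than either model alone; real gains depend on how correlated the models' errors are.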

Keywords: Data Mining, Machine Learning, Dengue, Regression, Artificial Neural Networks

Published
15/10/2019
CARVALHO, Thiago; TENÓRIO, Gabriel; FIGUEIREDO, Karla; VELLASCO, Marley; CAARLS, Wouter. Comparação de modelos de Machine Learning aplicados a previsão de casos totais de Dengue. In: ENCONTRO NACIONAL DE INTELIGÊNCIA ARTIFICIAL E COMPUTACIONAL (ENIAC), 16., 2019, Salvador. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2019. p. 658-669. ISSN 2763-9061. DOI: https://doi.org/10.5753/eniac.2019.9323.
