Detecção e previsão de eventos de alto impacto utilizando dados de redes sociais online (Detection and prediction of high-impact events using online social network data)
Abstract
This work aims to design, evaluate, and apply regression methods for the early detection of events using public data available in online social networks. The process comprises four phases: verifying the viability of predicting the event from informal data, identifying the models to be used, calculating the parameters of the prediction function, and evaluating the model in a case study. The context is the dengue outbreak in Brazil, as part of the Dengue Observatory project, where the resulting models correctly predicted the severity of the surges, on a weekly basis and for the largest Brazilian cities, for 99.12% of the disease incidence values.
References
Bourke, P. (1996). Cross correlation. http://paulbourke.net/miscellaneous/correlate/.
Brito, D., Gomide, J., Santos, W., Meira Jr., W., Veloso, A., and Almeida, V. (2012). Um sistema de alarme para vigilância epidemiológica de rumores utilizando redes sociais. In Proceedings of the 27th Brazilian Symposium on Databases, pages 225–232.
Gomide, J. S. (2012). Mineração de Redes Sociais para Detecção e Previsão de Eventos Reais. Master’s thesis, Universidade Federal de Minas Gerais, BR.
Markovsky, I. and Van Huffel, S. (2007). Overview of total least-squares methods. Signal Process., 87(10):2283–2302.
Petras, I. and Bednarova, D. (2010). Total least squares method. http://www.mathworks.com/matlabcentral/fileexchange/31109.
R Development Core Team (2011). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria.
Veloso, A., Meira Jr., W., and Zaki, M. J. (2006). Lazy associative classification. In International Conference on Data Mining, pages 645–654. IEEE Computer Society.
Venables, W. N. and Ripley, B. D. (2002). Modern Applied Statistics with S. Springer, New York, fourth edition. ISBN 0-387-95457-0.
Ver Hoef, J. and Boveng, P. (2007). Quasi-Poisson vs. negative binomial regression: how should we model overdispersed count data? Ecology, 88(11):2766–2772.
Zeileis, A., Kleiber, C., and Jackman, S. (2008). Regression models for count data in R. Journal of Statistical Software, 27(8):1–25.
Published
2013-07-23
How to Cite
BRITO, Denise E. F.; MEIRA JR., Wagner. Detecção e previsão de eventos de alto impacto utilizando dados de redes sociais online. In: SBC UNDERGRADUATE RESEARCH CONTEST (CTIC-SBC), 32., 2013, Maceió. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2013. p. 142-151.