Covid Data Analytics: Repository of Data from Multiple Sources on a COVID-19 Pandemic in Brazil

  • Pedro Moreira Federal University of Minas Gerais
  • Rodrigo Fonseca Federal University of Minas Gerais
  • Pedro Loures Alzamora Federal University of Minas Gerais
  • Ramon A. S. Franco Federal University of Western Bahia https://orcid.org/0000-0002-2653-9835
  • Janaina Guiginski Federal University of Minas Gerais
  • Evandro L. T. P. Cunha Federal University of Minas Gerais
  • Tereza Bernardes Federal University of Minas Gerais
  • Bruno Chagas Federal University of Minas Gerais
  • Kícila Ferregueti Federal University of Minas Gerais
  • Luana Passos Federal University of Minas Gerais
  • Luísa Cardoso Federal University of Minas Gerais
  • Raquel Schneider Federal University of Minas Gerais
  • Wallace Pereira Federal University of Minas Gerais
  • Ana Paula Couto da Silva Federal University of Minas Gerais
  • Wagner Meira Jr. Federal University of Minas Gerais

Abstract


This paper presents the construction and deployment of a data repository developed and used within the Covid Data Analytics (CDA) project, carried out by the Department of Computer Science at UFMG. The project aimed to monitor social, economic, and epidemiological aspects of COVID-19 in Brazil by analyzing data from official and non-official sources, online social networks, and the web in general. The repository, which contains 18 attributes and 1086 records, was built by collecting data directly from the selected sources; the data were then enriched and made available through a search tool developed specifically for this purpose.
Keywords: Covid-19, Dataset, Social networks

Published
2021-10-04
MOREIRA, Pedro et al. Covid Data Analytics: Repository of Data from Multiple Sources on a COVID-19 Pandemic in Brazil. In: DATASET SHOWCASE WORKSHOP (DSW), 3., 2021, Rio de Janeiro. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2021. p. 107-116. DOI: https://doi.org/10.5753/dsw.2021.17419.