How COVID-19 Impacted Data Science: a Topic Retrieval and Analysis from GitHub Projects’ Descriptions

  • Amanda C. R. Tavares, Universidade Federal de Minas Gerais (UFMG)
  • Natércia A. Batista, Universidade Federal de Minas Gerais (UFMG)
  • Mirella M. Moro, Universidade Federal de Minas Gerais (UFMG)


We present a data-driven study of data-science-oriented code repositories. The goal is to compare their topics of interest, and how those topics evolved over the COVID-19 pandemic, by analyzing Jupyter Notebook and Python projects from the year before the pandemic and from the pandemic period itself. We employ a state-of-the-art algorithm to find topics based on the repositories' descriptions, and compare how tuning its hyperparameters affects performance, aiming for better accuracy.
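
As a rough illustration of the before/during comparison described above, the sketch below ranks the terms whose share of description text grew the most between two groups of repository descriptions. It is not the topic algorithm used in the paper, which the abstract does not name; it is a much simpler term-frequency stand-in. The function names, the stopword list, and the toy descriptions are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch).
STOPWORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to", "with", "using"}


def tokenize(description):
    """Lowercase a description and keep word-like tokens that are not stopwords."""
    tokens = re.findall(r"[a-z][a-z0-9\-]*", description.lower())
    return [t for t in tokens if t not in STOPWORDS]


def term_shares(descriptions):
    """Return each term's share of all tokens in a group of descriptions."""
    counts = Counter(t for d in descriptions for t in tokenize(d))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {term: n / total for term, n in counts.items()}


def rising_terms(before, during, top_n=3):
    """Rank terms by how much their share grew from 'before' to 'during'."""
    b, d = term_shares(before), term_shares(during)
    growth = {t: d[t] - b.get(t, 0.0) for t in d}
    # Sort by growth (descending), breaking ties alphabetically for stable output.
    return sorted(growth, key=lambda t: (-growth[t], t))[:top_n]


# Toy data invented for illustration; not taken from the paper's dataset.
before = [
    "Image classification with deep learning",
    "Stock price prediction in Python",
]
during = [
    "COVID-19 cases dashboard in Python",
    "Deep learning for COVID-19 x-ray classification",
]

print(rising_terms(before, during, top_n=1))
```

A real topic model groups co-occurring terms into topics rather than ranking single words, so this sketch only conveys the shape of the comparison, not the method.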

Keywords: Data Science, GitHub, Python, Jupyter Notebooks, COVID-19


TAVARES, Amanda C. R.; BATISTA, Natércia A.; MORO, Mirella M. How COVID-19 Impacted Data Science: a Topic Retrieval and Analysis from GitHub Projects’ Descriptions. In: SIMPÓSIO BRASILEIRO DE BANCO DE DADOS (SBBD), 36., 2021, Rio de Janeiro. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2021. p. 325-330. ISSN 2763-8979. DOI: