Issue Labeling Dynamics in Open-Source Projects: A Comprehensive Analysis

  • Joselito Jr (UFBA)
  • Lidia P. G. Nascimento (UFBA)
  • Alcemir Santos (UESPI)
  • Ivan Machado (UFBA)


Open-source repositories play a vital role in modern software development, facilitating collaboration and code sharing among developers worldwide. In this study, we investigate the usage of labels in GitHub repositories to understand their impact on the issue resolution process and project management. We employ data mining techniques to gather a dataset comprising 10,673,459 issues from 13,280 repositories drawn from GitHub’s featured topics list. Our study design involves four phases: repository selection, mining repository issues, pre-processing issues’ components, and data processing to address the research questions (RQs). The first RQ focuses on the frequency and usage of standard and custom labels in repositories. The second and third RQs examine the average time taken to label issues and how the triage phase can be defined from labeling practices. We found that 73.14% of repositories employ issue labeling, with most labeling activity concentrated within the first 100 days after an issue is opened. This rapid labeling process is often followed by a structured pattern of label changes, potentially corresponding to specific issue phases such as triage, implementation, or change validation. Analyzing the time intervals between label changes, we observed that most issues undergo triage within 1 to 100 days, with labels prioritized based on their frequency in the resolution process. Our analysis sheds light on the significance of labels in organizing and classifying issues through a systematic triage process within open-source repositories. Labels serve as social and technical elements, contributing to enhanced organization, identification, implementation, and validation of code changes. These findings provide valuable insights into the effective management and maintenance of open-source projects, aiding developers and project managers in optimizing issue resolution processes.
The results and scripts from our study are available in the supplementary material repository for further exploration and reference by the software engineering community.
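
The two headline measurements above (the share of repositories that employ labels, and the time from issue opening to first label) can be illustrated with a minimal sketch. It is not the study's actual script. The record layout (an issue dictionary with a `created_at` timestamp and an `events` list whose entries carry `event` and `created_at`) is an assumption loosely modeled on GitHub issue events, and the function names `days_to_first_label` and `labeling_summary` are hypothetical. The rule that a repository "employs labeling" when at least one of its issues has a `labeled` event is also an assumption made for illustration.

```python
from datetime import datetime, timezone
from statistics import mean
from typing import Dict, List, Optional


def parse_ts(ts: str) -> datetime:
    """Parse an ISO 8601 UTC timestamp such as 2024-01-05T12:00:00Z."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def days_to_first_label(issue: dict) -> Optional[float]:
    """Days between issue opening and its first 'labeled' event, or None."""
    # Assumed layout: issue["events"] is a list of {"event": ..., "created_at": ...}.
    labeled = [
        parse_ts(e["created_at"])
        for e in issue.get("events", [])
        if e["event"] == "labeled"
    ]
    if not labeled:
        return None
    opened = parse_ts(issue["created_at"])
    return (min(labeled) - opened).total_seconds() / 86400.0


def labeling_summary(repos: Dict[str, List[dict]], window_days: float = 100.0) -> dict:
    """Summarize label usage over a mapping of repository name -> list of issues."""
    delays: List[float] = []
    labeled_repos = 0
    for issues in repos.values():
        repo_delays = [d for d in map(days_to_first_label, issues) if d is not None]
        if repo_delays:
            # Assumption: one labeled issue is enough to count the repository.
            labeled_repos += 1
        delays.extend(repo_delays)
    return {
        "repos_with_labels_pct": 100.0 * labeled_repos / len(repos) if repos else 0.0,
        "mean_days_to_first_label": mean(delays) if delays else None,
        "within_window_pct": (
            100.0 * sum(d <= window_days for d in delays) / len(delays)
            if delays
            else None
        ),
    }
```

Calling `labeling_summary` on a mapping of repository names to issue lists returns the percentage of repositories with at least one labeled issue, the mean delay in days, and the share of labeled issues whose first label arrived within `window_days` (100 by default, matching the window discussed above).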
Keywords: Open-source Repositories, Issue, Issue labeling, Defect, Triage, Issue Life Cycle


JR, Joselito; NASCIMENTO, Lidia P. G.; SANTOS, Alcemir; MACHADO, Ivan. Issue Labeling Dynamics in Open-Source Projects: A Comprehensive Analysis. In: SIMPÓSIO BRASILEIRO DE COMPONENTES, ARQUITETURAS E REUTILIZAÇÃO DE SOFTWARE (SBCARS), 18., 2024, Curitiba/PR. Proceedings [...]. Porto Alegre: Sociedade Brasileira de Computação, 2024. p. 51-60. DOI: