Toward Reliable Forward Snowballing in Systematic Literature Reviews: A Comparative Study and Framework Proposal

  • Jailma Januário (PUC-Rio)
  • Maria Isabel Nicolau (PUC-Rio)
  • Katia R. Felizardo (UTFPR)
  • Juliana Alves Pereira (PUC-Rio)

Abstract


Systematic Literature Reviews (SLRs) play a vital role in software engineering: they synthesize existing knowledge, identify research gaps, and guide future investigations with methodological rigor. Given the rapid growth of published research, search techniques such as snowballing are essential. Forward snowballing, in particular, discovers newer studies that cite key seed papers, improving the completeness of SLRs. Despite its value, it remains labor-intensive and error-prone when performed manually. To address this challenge, we mapped existing tools that support forward snowballing. Our analysis focuses on how well these tools automate core tasks, including the identification of relevant articles and the extraction of bibliographic metadata. We critically examine their reliability by comparing their outputs against a manually curated forward snowballing process. Key evaluation criteria include the completeness and relevance of the retrieved metadata, the quality of the scientific databases the tools draw on, and the ease of integration into SLR workflows. Based on our findings, we plan as future work to propose a framework that builds on existing tools to automate forward snowballing.
Keywords: Systematic Literature Review, Automation, Forward Snowballing
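
The comparison against a manually curated baseline described in the abstract can be illustrated with a minimal sketch. It assumes that each study is identified by its DOI and that completeness and relevance are approximated by set-based recall and precision. The abstract does not state which metrics the authors use, so these metrics, the function names `normalize_doi` and `compare_to_baseline`, and the example DOIs are all hypothetical.

```python
from typing import Dict, Iterable, List, Union


def normalize_doi(doi: str) -> str:
    """Lowercase a DOI and strip a resolver URL or 'doi:' prefix, if present."""
    doi = doi.strip().lower()
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.startswith(prefix):
            doi = doi[len(prefix):]
    return doi.strip()


def compare_to_baseline(
    tool_dois: Iterable[str], manual_dois: Iterable[str]
) -> Dict[str, Union[float, List[str]]]:
    """Compare a tool's forward-snowballing output with a manual baseline.

    Assumption: the manual baseline is treated as ground truth, so recall
    stands in for completeness and precision stands in for relevance.
    """
    tool = {normalize_doi(d) for d in tool_dois}
    manual = {normalize_doi(d) for d in manual_dois}
    found = tool & manual
    return {
        "recall": len(found) / len(manual) if manual else 0.0,
        "precision": len(found) / len(tool) if tool else 0.0,
        "missed": sorted(manual - tool),  # in the baseline, not returned by the tool
        "extra": sorted(tool - manual),   # returned by the tool, not in the baseline
    }


if __name__ == "__main__":
    # Hypothetical DOIs, used only to exercise the function.
    manual = ["10.1000/a", "10.1000/b", "10.1000/c", "10.1000/d"]
    tool = ["https://doi.org/10.1000/A", "doi:10.1000/b", "10.1000/x"]
    print(compare_to_baseline(tool, manual))
```

In this example the tool recovers two of the four baseline studies, so recall is 0.5, and one of its three results is absent from the baseline. Normalizing DOIs before comparison matters because tools may return the same identifier with different casing or prefixes.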

Published
22/09/2025
JANUÁRIO, Jailma; NICOLAU, Maria Isabel; FELIZARDO, Katia R.; PEREIRA, Juliana Alves. Toward Reliable Forward Snowballing in Systematic Literature Reviews: A Comparative Study and Framework Proposal. In: SIMPÓSIO BRASILEIRO DE ENGENHARIA DE SOFTWARE (SBES), 39., 2025, Recife/PE. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 762-768. ISSN 2833-0633. DOI: https://doi.org/10.5753/sbes.2025.11559.