Mining repositories to analyze the lifecycle of frameworks and libraries

  • Ronaldo Rubens Gesse Júnior UNESP
  • Higor Amario de Souza UNESP

Abstract

In a constantly evolving technological landscape, choosing the right components and technologies is crucial to the success of a software project. Frameworks and libraries are essential components that provide functionality to the code and accelerate development. Through software reuse, they help teams deliver results to end users more efficiently. This work proposes using Mining Software Repositories (MSR) to analyze the lifecycle of frameworks and libraries. We aim to understand whether a framework or library is properly updated, maintained, and sought after by the community, which may indicate its lifecycle stage. We explore data from several open source projects, namely the number of commits and contributors over time. We also use data from Google Trends to gauge the developer community's interest in these libraries and frameworks. We apply a trend metric, the Exponential Moving Average (EMA), to these variables to indicate the lifecycle stage of each framework or library. Initial results show that our approach can distinguish lifecycle trends for frameworks within the same domain. Future research will examine additional MSR data (such as pull requests, issues, and code changes), incorporate other data sources (Q&A sites), and apply time series Machine Learning techniques to improve the analysis.
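
As an illustration of the trend metric named above, the following minimal sketch computes an EMA over a series of activity counts and labels the latest trend. It is not the authors' implementation: the smoothing factor alpha = 2 / (span + 1), the seeding with the first value, the short/long comparison rule, the span values, and the sample counts are all assumptions made for this example.

```python
from typing import List, Sequence


def ema(values: Sequence[float], span: int) -> List[float]:
    """Exponential Moving Average with alpha = 2 / (span + 1).

    The first output is seeded with the first input value
    (one common convention; assumed here, not taken from the paper).
    """
    if span < 1:
        raise ValueError("span must be at least 1")
    if not values:
        return []
    alpha = 2.0 / (span + 1)
    smoothed = [float(values[0])]
    for x in values[1:]:
        # New value weighted by alpha, previous EMA by (1 - alpha).
        smoothed.append(alpha * x + (1.0 - alpha) * smoothed[-1])
    return smoothed


def trend(values: Sequence[float], short_span: int = 3,
          long_span: int = 12, tolerance: float = 1e-9) -> str:
    """Hypothetical heuristic: compare a short EMA against a long EMA.

    A short EMA above the long EMA suggests growing activity,
    below it suggests declining activity.
    """
    if not values:
        raise ValueError("values must not be empty")
    gap = ema(values, short_span)[-1] - ema(values, long_span)[-1]
    if gap > tolerance:
        return "rising"
    if gap < -tolerance:
        return "declining"
    return "flat"


# Made-up monthly commit counts for a fictional library.
monthly_commits = [120, 110, 95, 90, 70, 65, 50, 40]
print(trend(monthly_commits))
```

The same functions could be applied to contributor counts or Google Trends interest scores, since each is a series of values over time.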

Keywords: Mining software repositories, software reuse, frameworks, libraries

Published
30 September 2024
GESSE JÚNIOR, Ronaldo Rubens; SOUZA, Higor Amario de. Mining repositories to analyze the lifecycle of frameworks and libraries. In: SIMPÓSIO BRASILEIRO DE ENGENHARIA DE SOFTWARE (SBES), 38., 2024, Curitiba/PR. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2024. p. 599-605. DOI: https://doi.org/10.5753/sbes.2024.3568.