A Study of Popular Artificial Intelligence Python Modules in Open Source Projects

  • Camila Reno UNIFEI
  • João Marcos Cardoso UNIFEI
  • Viviane Cordeiro UNIFEI
  • Paulo Meirelles USP
  • Phyllipe Francisco UNIFEI

Abstract


The growing use and popularity of Artificial Intelligence (AI) is closely tied to the rise of the Python language, which is recognized for its simplicity and efficiency in AI projects. In this context, this study analyzes the use of AI modules in public GitHub repositories. To this end, we applied repository mining using the Anonymous tool. A total of 142 popular repositories were analyzed, and we identified 17 that contained modules from a predefined set of AI modules of interest. The most popular AI module identified was TensorFlow, present in just over 94% of these 17 repositories. This highlights TensorFlow's dominance in AI projects, which we attribute to its robustness, active community, and integration with other tools. Furthermore, the predominance of this library reflects developers' preference for well-supported solutions with extensive practical applications. Our results complement the lists of popular libraries available online in grey literature and support professionals in making informed decisions when choosing libraries, so that they can align their projects with the most common and successful open-source practices.
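As an illustration of the kind of analysis described above, the following is a minimal sketch of detecting AI module imports in Python source code and computing the share of AI-using repositories that import each module. It is not the tool used in the study: the `AI_MODULES` set, the function names, and the in-memory repository format are all assumptions made for this example, and the study's actual predefined module set is not listed in the abstract.

```python
import ast
from collections import Counter

# Hypothetical set of AI modules of interest (an assumption for this
# sketch; the study's actual predefined set is not given in the abstract).
AI_MODULES = {"tensorflow", "torch", "sklearn", "keras"}


def imported_ai_modules(source: str) -> set[str]:
    """Return the top-level AI module names imported by one source file."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # Absolute "from x.y import z" statements only.
            names = [node.module]
        else:
            continue
        for name in names:
            top_level = name.split(".")[0]
            if top_level in AI_MODULES:
                found.add(top_level)
    return found


def module_share(repos: dict[str, list[str]]) -> dict[str, float]:
    """Map each AI module to the fraction of AI-using repositories importing it.

    `repos` maps a repository name to the list of its Python source strings.
    Repositories that import no AI module are excluded from the denominator.
    """
    counts = Counter()
    ai_repo_total = 0
    for sources in repos.values():
        modules = set()
        for source in sources:
            modules |= imported_ai_modules(source)
        if modules:
            ai_repo_total += 1
            counts.update(modules)
    if ai_repo_total == 0:
        return {}
    return {module: n / ai_repo_total for module, n in counts.items()}
```

Each module is counted at most once per repository, so a reported share is a proportion of repositories rather than of files or import statements.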

Published
22 September 2025
RENO, Camila; CARDOSO, João Marcos; CORDEIRO, Viviane; MEIRELLES, Paulo; FRANCISCO, Phyllipe. A Study of Popular Artificial Intelligence Python Modules in Open Source Projects. In: WORKSHOP DE VISUALIZAÇÃO, EVOLUÇÃO E MANUTENÇÃO DE SOFTWARE (VEM), 13., 2025, Recife/PE. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 1-12. DOI: https://doi.org/10.5753/vem.2025.14202.