Evaluating the Importance of the contributing.md File in Open-Source Projects

  • Silvana de Andrade Gonçalves IFAC
  • Alexandre Plastino UFF
  • Daricélio Moreira Soares UFAC

Abstract


This article investigates whether creating and updating the contributing.md file in open-source projects influences the participation of new contributors. Using a temporal association rule analysis of pull request data, the study evaluates the impact of the file’s creation and of the subsequent changes made to it by the project’s core team. The results indicate that updating and actively using the contributing.md file facilitates the participation of new collaborators, reducing the chances of their contributions being rejected. Furthermore, the analysis reveals significant variations in these chances over time, providing a more detailed understanding of the factors that affect the acceptance and lifetime of pull requests.
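
To make the idea of a temporal association rule concrete, the following is a minimal illustrative sketch and not the authors' implementation. It computes the support and confidence of a rule such as "newcomer implies rejected" separately for each time window, so that the two windows can be compared. The field names (window, newcomer, merged), the window labels, and the toy records are all hypothetical and do not come from the study's data.

```python
from collections import defaultdict


def rule_metrics_by_window(pull_requests, antecedent, consequent):
    """Return {window: (support, confidence)} for the rule antecedent -> consequent.

    support    = share of pull requests in the window matching both sides
    confidence = share of antecedent-matching pull requests that also match
                 the consequent (None if nothing matches the antecedent)
    """
    # Per window: [total PRs, PRs matching antecedent, PRs matching both]
    counts = defaultdict(lambda: [0, 0, 0])
    for pr in pull_requests:
        c = counts[pr["window"]]
        c[0] += 1
        if antecedent(pr):
            c[1] += 1
            if consequent(pr):
                c[2] += 1

    metrics = {}
    for window, (total, n_ante, n_both) in counts.items():
        support = n_both / total
        confidence = n_both / n_ante if n_ante else None
        metrics[window] = (support, confidence)
    return metrics


# Made-up toy records (hypothetical): windows relative to the creation
# of contributing.md in an imaginary project.
toy_prs = [
    {"window": "before", "newcomer": True, "merged": False},
    {"window": "before", "newcomer": True, "merged": False},
    {"window": "before", "newcomer": True, "merged": True},
    {"window": "before", "newcomer": False, "merged": True},
    {"window": "after", "newcomer": True, "merged": True},
    {"window": "after", "newcomer": True, "merged": True},
    {"window": "after", "newcomer": True, "merged": False},
    {"window": "after", "newcomer": False, "merged": True},
]

# Rule under inspection: newcomer -> rejected (not merged)
metrics = rule_metrics_by_window(
    toy_prs,
    antecedent=lambda pr: pr["newcomer"],
    consequent=lambda pr: not pr["merged"],
)

for window, (support, confidence) in metrics.items():
    print(window, round(support, 2), round(confidence, 2))
```

In this toy data the confidence of the rule drops from 2/3 in the "before" window to 1/3 in the "after" window. That is the kind of temporal variation the analysis looks for, though the numbers here are invented purely for illustration.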

Published
2025-07-20
GONÇALVES, Silvana de Andrade; PLASTINO, Alexandre; SOARES, Daricélio Moreira. Evaluating the Importance of the contributing.md File in Open-Source Projects. In: INTEGRATED SOFTWARE AND HARDWARE SEMINAR (SEMISH), 52., 2025, Maceió/AL. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 13-24. ISSN 2595-6205. DOI: https://doi.org/10.5753/semish.2025.6870.