Recommending Collaborators Based on Co–Changed Files: A Controlled Experiment

  • Kattiana Constantino UFMG
  • Raquel Prates UFMG
  • Eduardo Figueiredo UFMG


Active collaboration is essential for the success of software projects throughout the development life cycle. However, on social coding platforms such as GitHub, developers find it challenging to identify potential collaborators with whom they could create new or stronger ties and improve the quality of contributions. We therefore conducted a controlled experiment with 35 participants, who were asked to find collaborators with similar interests using a prototype recommendation tool and GitHub. The results confirm that a recommender based on co–changed files can provide suitable collaborator recommendations to developers of a specific project.


How to Cite

CONSTANTINO, Kattiana; PRATES, Raquel; FIGUEIREDO, Eduardo. Recommending Collaborators Based on Co–Changed Files: A Controlled Experiment. In: SIMPÓSIO BRASILEIRO DE SISTEMAS COLABORATIVOS (SBSC), 18., 2023, Rio de Janeiro/RJ. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2023. p. 154-168. ISSN 2326-2842. DOI: