Attributes that may raise the occurrence of merge conflicts

Authors

Menezes, J. W., Trindade, B., Pimentel, J. F., Plastino, A., Murta, L., and Costa, C.

DOI:

https://doi.org/10.5753/jserd.2021.1911

Keywords:

Version Control, Merge Conflicts, Conflict prediction

Abstract

Collaborative software development typically involves the use of branches. Changes made in different branches are usually merged, and direct and indirect conflicts may arise. Some studies investigate ways to deal with merge conflicts and measure the effort that this activity may require. However, the factors that may reduce the occurrence of conflicts deserve more and deeper attention. This paper identifies and analyzes attributes of past merges with and without conflicts to understand what may induce direct conflicts. We analyzed 182,273 merge scenarios from 80 projects written in eight different programming languages to find characteristics that increase the chances of a merge having a conflict. We found that the number of changed files, the number of commits, the number of changed lines, and the number of committers have the strongest influence on the occurrence of merge conflicts. Moreover, attributes of the branch that is being integrated seem to be more influential than the same attributes of the receiving branch. Additionally, we found positive correlations between the occurrence of conflicts and both the duration of the branch and the intersection of developers in both branches. Finally, we observed that projects written in PHP, JavaScript, and Java are more prone to conflicts.
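
To make the phrase "increase the chances of a merge to have a conflict" concrete, the sketch below computes the lift of a rule such as "many changed files implies conflict", that is, the conflict rate among merges matching a condition divided by the overall conflict rate. It is only an illustration: the abstract does not state which measure the study used, and the `Merge` record, the threshold of 10 files, and the toy data are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Merge:
    """One merge scenario; fields are hypothetical, chosen for illustration."""
    changed_files: int  # files changed in the branch being integrated
    conflict: bool      # whether the merge produced a direct conflict


def lift(merges, predicate):
    """Lift of the rule `predicate -> conflict`.

    lift = P(conflict | predicate) / P(conflict).
    A value above 1 means conflicts are more frequent when the predicate holds.
    """
    matching = [m for m in merges if predicate(m)]
    if not merges or not matching:
        raise ValueError("need at least one merge and one matching merge")
    base_rate = sum(m.conflict for m in merges) / len(merges)
    if base_rate == 0:
        raise ValueError("no conflicts in the data; lift is undefined")
    rate_given = sum(m.conflict for m in matching) / len(matching)
    return rate_given / base_rate


# Toy data, invented for this example (not taken from the study).
merges = [
    Merge(changed_files=2, conflict=False),
    Merge(changed_files=3, conflict=False),
    Merge(changed_files=1, conflict=False),
    Merge(changed_files=40, conflict=True),
    Merge(changed_files=55, conflict=True),
    Merge(changed_files=60, conflict=False),
    Merge(changed_files=4, conflict=False),
    Merge(changed_files=5, conflict=True),
]

# Assumed threshold: treat 10 or more changed files as "many".
many_files_lift = lift(merges, lambda m: m.changed_files >= 10)
print(f"lift(many changed files -> conflict) = {many_files_lift:.2f}")
```

The same function can be reused with other predicates (for example, on a hypothetical commit count or committer count field) to compare how strongly each attribute is associated with conflicts in a given dataset.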

Published

2021-10-25

How to Cite

Menezes, J. W., Trindade, B., Pimentel, J. F., Plastino, A., Murta, L., & Costa, C. (2021). Attributes that may raise the occurrence of merge conflicts. Journal of Software Engineering Research and Development, 9(1), 14:1 – 14:14. https://doi.org/10.5753/jserd.2021.1911

Section

Research Article