Guidelines for Data Engineering Documentation in a DevDocOps Approach

  • Stephany Mendes Oliveira UFSCar
  • Daniel Lucrédio UFSCar

Abstract


The software development process has been studied since the early days of computing. As development practices evolved, they required processes capable of supporting intensive work, which paved the way for agile methodologies. The growing need for continuous integration (CI) and continuous deployment/delivery (CD) then gave rise to new architectures and practices that enable scalable, maintainable, and reusable environments, collectively known as DevOps (Development + Operations). In this context, the DevDocOps approach integrates continuous documentation into the software development lifecycle. However, little has been published about the benefits of this approach. To address this gap, an empirical study was conducted that applied findings from the literature to a real development environment by integrating continuous documentation into the data engineering development lifecycle. Based on developer and technical-lead feedback, the results highlight the importance of technical documentation in an agile development environment and show how automating this process can improve the quality and efficiency of software deliveries.
Keywords: documentation automation, continuous deployment (CD), data engineering, technical documentation, DevDocOps

Published
September 30, 2024
OLIVEIRA, Stephany Mendes; LUCRÉDIO, Daniel. Guidelines for Data Engineering Documentation in a DevDocOps Approach. In: SIMPÓSIO BRASILEIRO DE COMPONENTES, ARQUITETURAS E REUTILIZAÇÃO DE SOFTWARE (SBCARS), 18., 2024, Curitiba/PR. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2024. p. 31-40. DOI: https://doi.org/10.5753/sbcars.2024.3834.