Reference Process for Integrating Data Science Workflows and Governance in Big Data Systems

  • Victória T. Oliveira UFC
  • Rossana M. de Castro Andrade UFC
  • Pedro Almir Martins Oliveira IFMA
  • Ismayle Santos UECE
  • Miguel Franklin de Castro UFC

Abstract


This article presents a structured workflow for developing data-driven analyses in the public sector, from identifying institutional challenges to evidence-based decision-making. The proposed process organizes the steps around three main profiles (public manager, administrator, and data scientist) and integrates data governance practices, compliance with the LGPD (Brazil's General Data Protection Law), and reproducible analytical processes. It also details the technical method for adding new analyses to the institutional platform, which involves preparing scripts, using machine learning models, and publishing analytical products with automated visualizations. The proposal contributes to standardization, transparency, and efficiency in how public bodies adopt analytical intelligence.

Keywords: Big Data Systems, Software Engineering, Data Science, Workflows, Governance, Public Sector

Published
23/09/2025
OLIVEIRA, Victória T.; ANDRADE, Rossana M. de Castro; OLIVEIRA, Pedro Almir Martins; SANTOS, Ismayle; CASTRO, Miguel Franklin de. Reference Process for Integrating Data Science Workflows and Governance in Big Data Systems. In: WORKSHOP BRASILEIRO DE ENGENHARIA DE SOFTWARE INTELIGENTE (ISE), 4., 2025, Recife/PE. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 43-48. DOI: https://doi.org/10.5753/ise.2025.14901.