Data Governance in Systems-of-Systems: A Provenance-Data-Oriented Approach
Abstract
The development of Systems-of-Systems (SoS), which integrate independent systems through well-defined data flows, has grown in recent years. Despite advantages such as reuse and resilience, SoS face data governance challenges, particularly the lack of mechanisms to control the data life cycle. In an SoS, data generated by one system is used by others, which makes it difficult to ensure traceability, quality, and integrity from collection to storage. This paper proposes PROVGov-SoS, a governance approach based on the management of provenance data. The approach structures the flow of information between systems so that users can understand the data life cycle within the SoS. It was evaluated in a feasibility study on a real SoS, with promising results.
Keywords:
provenance, governance, systems-of-systems
Published
29/09/2025
How to Cite
ALMEIDA, Jéssica Monçôres de; BRAGANHOLO, Vanessa; DE OLIVEIRA, Daniel. Governança de Dados em Sistemas-de-Sistemas: Uma Abordagem Orientada à Dados de Proveniência. In: SIMPÓSIO BRASILEIRO DE BANCO DE DADOS (SBBD), 40., 2025, Fortaleza/CE. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 182-195. ISSN 2763-8979. DOI: https://doi.org/10.5753/sbbd.2025.247059.
