ETL4LinkedProv: Managing Multigranular Linked Data Provenance
Keywords: ETL, Linked Data, RDF, Provenance, Workflows, LOD2
This article presents ETL4LinkedProv, an approach for managing the collection and publication of provenance at distinct levels of granularity as Linked Data. The approach uses ETL-workflows and a component named Provenance Collector Agent to collect two kinds of provenance (prospective and retrospective) and to integrate them with domain data. The component also sets the granularity of the provenance to be captured. ETL4LinkedProv is evaluated in a real-world scenario in which Brazilian government agencies produce and publish public data sources as Linked Data. We also measure the amount of provenance generated, in terms of both the runtime of the ETL-workflows and the number of published RDF triples.
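To make the distinction between the two kinds of provenance and the effect of granularity concrete, the following minimal sketch may help. It is not the Provenance Collector Agent itself. It is an illustration under stated assumptions: the `http://example.org/etl/` namespace, the `Step` class, the function names, and the two granularity labels (`"coarse"` and `"fine"`) are all hypothetical, and the choice of W3C PROV-O and Dublin Core terms is ours rather than something stated in the text above. Prospective provenance is modelled as a description of the workflow definition, retrospective provenance as a trace of one execution.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Real vocabularies: W3C PROV-O, RDF, and Dublin Core terms.
PROV = "http://www.w3.org/ns/prov#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
DCT = "http://purl.org/dc/terms/"
# Hypothetical namespace, used only for this illustration.
EX = "http://example.org/etl/"

Triple = Tuple[str, str, str]


@dataclass
class Step:
    """One step of an ETL workflow (hypothetical structure)."""
    name: str
    inputs: List[str]
    outputs: List[str]


def prospective(workflow: str, steps: List[Step], granularity: str) -> List[Triple]:
    """Describe the workflow definition, i.e. what is planned to run."""
    wf = EX + workflow
    triples = [(wf, RDF_TYPE, PROV + "Plan")]
    if granularity == "fine":
        for s in steps:
            triples.append((EX + s.name, RDF_TYPE, PROV + "Plan"))
            triples.append((wf, DCT + "hasPart", EX + s.name))
    return triples


def retrospective(workflow: str, run_id: str, steps: List[Step],
                  granularity: str) -> List[Triple]:
    """Describe one execution, i.e. what actually ran and what it touched."""
    run = EX + "run/" + run_id
    # Simplification: the run is linked to its plan with prov:used.
    triples = [(run, RDF_TYPE, PROV + "Activity"),
               (run, PROV + "used", EX + workflow)]
    if granularity == "fine":
        for s in steps:
            step_run = run + "/" + s.name
            triples.append((step_run, RDF_TYPE, PROV + "Activity"))
            triples.append((step_run, DCT + "isPartOf", run))
            for i in s.inputs:
                triples.append((step_run, PROV + "used", EX + i))
            for o in s.outputs:
                triples.append((EX + o, PROV + "wasGeneratedBy", step_run))
    return triples


def to_ntriples(triples: List[Triple]) -> str:
    """Serialize IRI-only triples as N-Triples lines."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


STEPS = [
    Step("extract", inputs=["source.csv"], outputs=["staging"]),
    Step("load", inputs=["staging"], outputs=["dataset.rdf"]),
]

if __name__ == "__main__":
    for level in ("coarse", "fine"):
        graph = (prospective("wf1", STEPS, level)
                 + retrospective("wf1", "r1", STEPS, level))
        print(level, len(graph))
    print(to_ntriples(prospective("wf1", STEPS, "coarse")))
```

In this toy setting, moving from the coarse to the fine level multiplies the number of provenance triples for the same two-step workflow, which is the kind of overhead the measurement of published RDF triples is meant to capture.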