Capturing Distributed Provenance Metadata from Cloud-Based Scientific Workflows

Authors

  • Sergio Manuel Serra da Cruz, UFRJ
  • Carlos Eduardo Paulino, UFRJ
  • Daniel de Oliveira, UFRJ
  • Maria Luiza Machado Campos, UFRJ
  • Marta Mattoso, UFRJ

DOI:

https://doi.org/10.5753/jidm.2011.1384

Keywords:

Provenance, Scientific Workflows, Cloud Computing, Metadata

Abstract

Scientific workflows are abstractions used to model scientific experiments. Running these experiments often requires high-performance computing environments such as clusters and grids, and the scientific community is now beginning to adopt cloud computing as well. However, support for collecting and recording retrospective workflow provenance in cloud environments is still incipient. This paper presents an approach to capturing distributed provenance metadata from cloud-based scientific workflows. The approach was implemented as an evolution of the Matrioshka architecture, refactored for cloud environments. Preliminary results show that provenance metadata captured from virtual components running in the cloud can help scientists manage and reproduce their large-scale in silico experiments.

Published

2011-08-12

How to Cite

da Cruz, S. M. S., Paulino, C. E., de Oliveira, D., Campos, M. L. M., & Mattoso, M. (2011). Capturing Distributed Provenance Metadata from Cloud-Based Scientific Workflows. Journal of Information and Data Management, 2(1), 43. https://doi.org/10.5753/jidm.2011.1384

Section

SBBD 2010 Short Papers