Implementing W2Share: Supporting Reproducibility and Quality Assessment in eScience

  • Lucas Carvalho UNICAMP
  • Joana Malaverri UNICAMP
  • Claudia Medeiros UNICAMP

Abstract


An open problem in the scientific community is how to support reproducibility and quality assessment of scientific experiments. Solutions must help scientists reproduce experimental procedures reliably while providing mechanisms for documenting the experiments, thereby enhancing integrity and transparency. They must also incorporate features that allow the assessment of the procedures, the data used, and the results of those experiments. To meet these requirements, we designed W2Share, a framework for reproducibility and quality assessment. This paper introduces our first implementation of W2Share, which guides scientists through a step-by-step process to ensure reproducibility, based on a script-to-workflow conversion strategy. W2Share also incorporates features for annotating experiments with quality information. We validate our prototype using a real-world scenario in Bioinformatics.

Published
22/07/2017
CARVALHO, Lucas; MALAVERRI, Joana; MEDEIROS, Claudia. Implementing W2Share: Supporting Reproducibility and Quality Assessment in eScience. In: BRAZILIAN E-SCIENCE WORKSHOP (BRESCI), 11., 2017, São Paulo. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2017. p. 5-12. ISSN 2763-8774. DOI: https://doi.org/10.5753/bresci.2017.9916.