Applying Provenance Ontologies to Scientific Workflows: A Systematic Mapping

  • Luiz Gustavo Dias
  • Bruno Lopes
  • Daniel de Oliveira

Abstract

Scientific experiments modeled as workflows are executed by complex engines called Workflow Management Systems (WfMSs). Many WfMSs exist, each with its own strengths and weaknesses, but they all share several characteristics, such as the need to support scientists in analyzing their data. Provenance data play an important role in supplying the information needed at different stages of an experiment. This work therefore aims to map and characterize approaches that use one of four selected provenance ontologies, analyzing factors such as suitability, execution requirements, and architecture. The study found that provenance ontologies can be applied at different stages of the scientific workflow life cycle, but mainly in the analysis phase.

Published
24/06/2019
DIAS, Luiz Gustavo; LOPES, Bruno; DE OLIVEIRA, Daniel. Aplicação de Ontologias de Proveniência em Workflows Científicos: um Mapeamento Sistemático. In: BRAZILIAN E-SCIENCE WORKSHOP (BRESCI), 13., 2019, Belém. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2019. ISSN 2763-8774. DOI: https://doi.org/10.5753/bresci.2019.10031.