Challenges in Supporting the Composition of Large-Scale Scientific Experiments

  • Marta Mattoso UFRJ
  • Cláudia Werner UFRJ
  • Guilherme Horta Travassos UFRJ
  • Vanessa Braganholo UFRJ
  • Leonardo Murta UFF
  • Eduardo Ogasawara UFRJ
  • Frederico de Oliveira UFRJ
  • Wallace Martinho UFRJ

Abstract

Managing large-scale scientific experiments requires a set of supporting functionalities. One of them is support for experiment composition, which includes the design of scientific workflows. However, little support is available for designing workflows and instantiating them for execution in a Workflow Management System, for reusing workflows, for controlling their evolution, or for collecting information for data provenance. In this paper we present solutions to some of these problems based on techniques from Software Engineering and Databases. Preliminary results with real experiments indicate that this approach is feasible.

Published
20/07/2009
MATTOSO, Marta; WERNER, Cláudia; TRAVASSOS, Guilherme Horta; BRAGANHOLO, Vanessa; MURTA, Leonardo; OGASAWARA, Eduardo; OLIVEIRA, Frederico de; MARTINHO, Wallace. Desafios no apoio à composição de experimentos científicos em larga escala. In: SEMINÁRIO INTEGRADO DE SOFTWARE E HARDWARE (SEMISH), 36., 2009, Bento Gonçalves/RS. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2009. p. 307-321. ISSN 2595-6205.