SimiFlow: An Architecture for Grouping Workflows by Similarity
Abstract
Scientific workflows have been used to support scientific experiments. The workflows within a single experiment usually differ only by small variations, which are difficult to manage with traditional Scientific Workflow Management Systems. The concept of an Experiment Line was defined to fill this gap: it makes it possible to represent an experiment and promotes the systematic composition of scientific workflows. However, experiment lines still do not solve the problem of systematizing pre-existing workflows. To that end, this paper presents SimiFlow, an architecture for comparing and grouping pre-existing workflows by similarity, with the aim of building experiment lines through a bottom-up approach.
References
Altintas, I., Berkley, C., Jaeger, E., Jones, M., Ludascher, B., Mock, S., (2004), "Kepler: an extensible system for design and execution of scientific workflows". In: 16th SSDBM, p. 423-424, Santorini, Greece.
Bunke, H., Shearer, K., (1998), "A graph distance metric based on the maximal common subgraph", Pattern Recogn. Lett., v. 19, n. 3-4, p. 255-259.
Callahan, S. P., Freire, J., Santos, E., Scheidegger, C. E., Silva, C. T., Vo, H. T., (2006), "VisTrails: visualization meets data management". In: Proceedings of the 2006 ACM SIGMOD, p. 745-747, Chicago, IL, USA.
Cavalcanti, M. C., Targino, R., Baião, F., Rössle, S. C., Bisch, P. M., Pires, P. F., Campos, M. L. M., Mattoso, M., (2005), "Managing structural genomic workflows using web services", Data & Knowledge Engineering, v. 53, n. 1, p. 45-74.
Deelman, E., Gannon, D., Shields, M., Taylor, I., (2009), "Workflows and e-Science: An overview of workflow system features and capabilities", Future Generation Computer Systems, v. 25, n. 5, p. 528-540.
GExp, (2009), Brazilian project for supporting large scale management of scientific experiments, [link].
Goble, C. A., Roure, D. C. D., (2007), "myExperiment: social networking for workflow-using e-scientists". In: Proceedings of the 2nd workshop on Workflows in support of large-scale science, p. 1-2, Monterey, California, USA.
Jain, A. K., Murty, M. N., Flynn, P. J., (1999), "Data clustering: a review", ACM Comput. Surv., v. 31, n. 3, p. 264-323.
Larman, C., Applying UML and Patterns: An Introduction to Object-Oriented Analysis and Design and Iterative Development. 3 ed. Prentice Hall PTR.
Mattoso, M., Werner, C., Travassos, G. H., Braganholo, V., Murta, L., Ogasawara, E., Oliveira, D., Cruz, S. M. S. D., Martinho, W., (2010), "Towards Supporting the Life Cycle of Large Scale Scientific Experiments", To be published in Int. J. Business Process Integration and Management, n. Special Issue on Scientific Workflows.
Ogasawara, E., Paulino, C., Murta, L., Werner, C., Mattoso, M., (2009), "Experiment Line: Software Reuse in Scientific Workflows". In: 21st SSDBM, p. 264-272.
Ohst, D., Welle, M., Kelter, U., (2003), "Differences between versions of UML diagrams". In: Proceedings of the 9th European software engineering conference held jointly with 11th ACM SIGSOFT international symposium on Foundations of software engineering, p. 227-236, Helsinki, Finland.
Oinn, T., Addis, M., Ferris, J., Marvin, D., Senger, M., Greenwood, M., Carver, T., Glover, K., Pocock, M. R., et al., (2004), Taverna: a tool for the composition and enactment of bioinformatics workflows. Oxford Univ Press.
Santos, E., Lins, L., Ahrens, J. P., Freire, J., Silva, C. T., (2008), "A First Study on Clustering Collections of Workflow Graphs", Springer-Verlag, p. 160-173.
Seo, J., Seno, S., Takenaka, Y., Matsuda, H., (2007), "Retrieving Functionally Similar Bioinformatics Workflows Using TF-IDF Filtering", IPSJ Digital Courier, v. 3, p. 164-173.
SiDiff, (2010), SiDiff, [link].
Uhrig, S., (2008), "Matching class diagrams: with estimated costs towards the exact solution?". In: Proceedings of the 2008 international workshop on Comparison and versioning of software models, p. 7-12, Leipzig, Germany.
Published
20 July 2010
How to Cite
SILVA, Vítor; CHIRIGATI, Fernando; MAIA, Kely; OGASAWARA, Eduardo; OLIVEIRA, Daniel de; BRAGANHOLO, Vanessa; MURTA, Leonardo; MATTOSO, Marta. SimiFlow: Uma Arquitetura para Agrupamento de Workflows por Similaridade. In: BRAZILIAN E-SCIENCE WORKSHOP (BRESCI), 4., 2010, Belo Horizonte/MG. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2010. p. 193-200. ISSN 2763-8774.