Integrating Scientific Workflows with Scientific Gateways: A Bioinformatics Experiment in the Brazilian National High-Performance Computing Network

  • Maria Luiza Mondelli LNCC
  • Marcelo Monteiro Galheigo LNCC
  • Vívian Medeiros LNCC
  • Bruno F. Bastos LNCC
  • Antônio Tadeu Azevedo Gomes LNCC
  • Marta Mattoso UFRJ
  • Ana Tereza R. Vasconcelos LNCC
  • Luiz M. R. Gadelha Jr. LNCC

Abstract


Bioinformatics experiments are evolving rapidly and constantly due to improvements in sequencing technologies. These experiments usually demand high-performance computing and produce large volumes of data. They also require different programs to be executed in a certain order, which allows the experiments to be modeled as workflows. However, users do not always have the infrastructure needed to run them. Our contribution is the integration of scientific workflow management systems with grid-enabled scientific gateways, giving users a transparent way to run these workflows on geographically distributed computing resources. Making the workflow available through the gateway improves the usability of these experiments.


 

Published
July 4, 2016
MONDELLI, Maria Luiza; GALHEIGO, Marcelo Monteiro; MEDEIROS, Vívian; BASTOS, Bruno F.; GOMES, Antônio Tadeu Azevedo; MATTOSO, Marta; VASCONCELOS, Ana Tereza R.; GADELHA JR., Luiz M. R. Integrating Scientific Workflows with Scientific Gateways: A Bioinformatics Experiment in the Brazilian National High-Performance Computing Network. In: BRAZILIAN E-SCIENCE WORKSHOP (BRESCI), 10., 2016, Porto Alegre. Proceedings [...]. Porto Alegre: Sociedade Brasileira de Computação, 2016. p. 277-284. ISSN 2763-8774. DOI: https://doi.org/10.5753/bresci.2016.10010.