SAVIME: A Database Management System for Simulation Data Analysis and Visualization

  • Hermano Lustosa (LNCC)
  • Fábio Porto (LNCC)
  • Patrick Valduriez (Inria)

Abstract


Limitations in current DBMSs prevent their wide adoption in scientific applications. To let scientific applications benefit from DBMS support, enabling declarative data analysis and visualization over scientific data, we present SAVIME, an in-memory array DBMS. In this work we describe SAVIME along with its data model. Our preliminary evaluation shows how SAVIME, by using a simple storage definition language (SDL), can outperform the state-of-the-art array database system, SciDB, during data ingestion. We also show that it is possible to use SAVIME as a storage alternative for a numerical solver without affecting its scalability.

Keywords: Scientific application, declarative data analysis, visualization over scientific data, in-memory array DBMS, storage definition language (SDL)

Published
7 October 2019
How to Cite

LUSTOSA, Hermano; PORTO, Fábio; VALDURIEZ, Patrick. SAVIME: A Database Management System for Simulation Data Analysis and Visualization. In: SIMPÓSIO BRASILEIRO DE BANCO DE DADOS (SBBD), 34., 2019, Fortaleza. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2019. p. 85-96. ISSN 2763-8979. DOI: https://doi.org/10.5753/sbbd.2019.8810.