SmartLTM: Smart Larger-Than-Memory Storage for Hybrid Database Systems

  • Paulo R. P. Amora Universidade Federal do Ceará (UFC) http://orcid.org/0000-0001-5522-6080
  • Elvis M. Teixeira Universidade Federal do Ceará (UFC)
  • Francisco D. B. S. Praciano Universidade Federal do Ceará (UFC)
  • Javam C. Machado Universidade Federal do Ceará (UFC)

Abstract


Main-memory DBMSs can offer hybrid and evolving storage architectures instead of the traditional row or column storage layouts. Although RAM is affordable nowadays, it remains a limited resource in terms of available storage space when compared with conventional storage devices. Because of this space restriction, techniques that trade storage space against query performance have been developed, and they should be applied to data that is not frequently accessed or updated. This work proposes SmartLTM, a data eviction mechanism that takes into account the decisions the DBMS has already made when optimizing data storage for the query workload. We discuss how to migrate data and how to access it, and we describe the main differences between our approach and a row-based one. We also analyze the behavior of our solution on different storage media. Experiments show that cold data access with SmartLTM incurs an acceptable 17% throughput loss, against 26% for the row-based approach, while retrieving only half of the data to answer queries.
Keywords: Hybrid Database Systems, Optimizing data storage, query workload
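
The idea of evicting cold data according to a layout the DBMS has already chosen can be illustrated with a minimal sketch. It is not the SmartLTM implementation: the `ColdStore` class, the function names, the four-attribute tuple and the two column groups are all hypothetical, and the cold store is an in-memory dictionary that only counts how many attribute values are read back. Under these assumptions, a query that touches one of two equally sized column groups reads half as many values as it would from a row-based cold block.

```python
from dataclasses import dataclass, field


@dataclass
class ColdStore:
    """Hypothetical stand-in for a secondary storage device.

    It keeps blocks in a dict and counts the attribute values read back.
    """
    blocks: dict = field(default_factory=dict)
    values_read: int = 0

    def write(self, key, values):
        self.blocks[key] = dict(values)

    def read(self, key):
        block = self.blocks[key]
        self.values_read += len(block)
        return block


def evict_row(store, tuple_id, row):
    # Row-based eviction: the whole tuple becomes a single cold block.
    store.write((tuple_id, "row"), row)


def evict_by_groups(store, tuple_id, row, column_groups):
    # Layout-aware eviction (assumed): one cold block per column group
    # that the layout optimizer is assumed to have chosen earlier.
    for group_no, columns in enumerate(column_groups):
        store.write((tuple_id, group_no), {c: row[c] for c in columns})


def fetch_row(store, tuple_id, wanted):
    # The full tuple must be read even if only some attributes are needed.
    row = store.read((tuple_id, "row"))
    return {c: row[c] for c in wanted}


def fetch_by_groups(store, tuple_id, wanted, column_groups):
    # Only the groups that contain a requested attribute are read.
    result = {}
    for group_no, columns in enumerate(column_groups):
        if set(columns) & set(wanted):
            block = store.read((tuple_id, group_no))
            result.update({c: v for c, v in block.items() if c in wanted})
    return result


row = {"a": 1, "b": 2, "c": 3, "d": 4}
groups = [("a", "b"), ("c", "d")]  # assumed output of a layout optimizer

row_store, group_store = ColdStore(), ColdStore()
evict_row(row_store, 0, row)
evict_by_groups(group_store, 0, row, groups)

row_result = fetch_row(row_store, 0, ["a", "b"])
group_result = fetch_by_groups(group_store, 0, ["a", "b"], groups)
print(row_store.values_read, group_store.values_read)  # prints "4 2"
```

Both fetches return the same attributes; only the amount of cold data read differs. The one-half ratio here follows from the chosen toy layout and should not be read as a reproduction of the reported experiments.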

Published
25/08/2018
AMORA, Paulo R. P.; TEIXEIRA, Elvis M.; PRACIANO, Francisco D. B. S.; MACHADO, Javam C. SmartLTM: Smart Larger-Than-Memory Storage for Hybrid Database Systems. In: SIMPÓSIO BRASILEIRO DE BANCO DE DADOS (SBBD), 33., 2018, Rio de Janeiro. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2018. p. 13-24. ISSN 2763-8979. DOI: https://doi.org/10.5753/sbbd.2018.22215.