Relational graph data management on the edge: Grouping vertices’ neighborhood with Edge-k

Resumo


As the amount of data represented as graph grows, several frameworks are employing relational databases to manage them. However, the existing solutions store graphs creating a row for each edge in an edge table. In this paper, we propose Edge-k, a novel storage approach that combines additional columns in the edges table, allowing to tune the number of edges stored in a single row by taking into account the overall neighborhood of the vertices, thus providing a better table organization. Compared to the existing approaches in the literature, experiments reveal that our proposal was able to reach a speedup of 66% over a representative real dataset and up to 57% in synthetic datasets when processing Single Source Shortest Path queries. Hence, our solution advances the state of the art in the context of graph data management within relational databases systems.
Palavras-chave: Relational Graph, Grouping Vertices

Referências

Barabási, A.-L. and Pósfai, M. (2016). Network science. Cambridge University Press.

Bornea, M. A., Dolby, J., Kementsietsidis, A., Srinivas, K., Dantressangle, P., Udrea, O., and Bhattacharjee, B. (2013). Building an efficient rdf store over a relational database. In Proceedings of the 2013 SIGMOD, pages 121–132, New York, NY, USA. ACM.

Chen, R. (2013). Managing massive graphs in relational DBMS. In 2013 International Conference on Big Data, pages 1–8, Santa Clara, CA, USA. IEEE.

Fan, J., Raj, A. G. S., and Patel, J. M. (2015). The case against specialized graph analytics engines. In Proceeding of the 2015 CIDR, Asilomar, CA, USA. Online Proceedings.

Gao, J., Jin, R., Zhou, J., Yu, J. X., Jiang, X., and Wang, T. (2011). Relational approach for shortest path discovery over large graphs. PVLDB Endowment, 5(4):358–369.

Gao, J., Zhou, J., Yu, J. X., and Wang, T. (2014). Shortest path computing in relational DBMSs. IEEE Transactions on Knowledge and Data Engineering, 26(4):997–1011.

Jindal, A., Madden, S., Castellanos, M., and Hsu, M. (2015). Graph analytics using vertica relational database. In International Conference on Big Data, pages 1191–1200. IEEE.

Kyrola, A., Blelloch, G., and Guestrin, C. (2012). Graphchi: Large-scale graph computation on just a pc. In Proceedings of the 10th Conference on Operating Systems Design and Implementation, pages 31–46, Berkeley, CA, USA. USENIX Association.

Levandoski, J. J. and Mokbel, M. F. (2009). RDF data-centric storage. In 2009 International Conference on Web Services, pages 911–918, Los Angeles, CA, USA. IEEE.

Neumann, T. and Weikum, G. (2010). The RDF-3X engine for scalable management of RDF data. The VLDB Journal, 19(1):91–113.

Silva, D. N. R. D., Wehmuth, K., Osthoff, C., Appel, A. P., and Ziviani, A. (2016). Análise de desempenho de plataformas de processamento de grafos. In 31º Simpósio Brasileiro de Banco de Dados, SBBD, pages 16–27, Salvador, BH, Brasil. SBC.

Sun, W., Fokoue, A., Srinivas, K., Kementsietsidis, A., Hu, G., and Xie, G. (2015). Sql-graph: An efficient relational-based property graph store. In Proceedings of the 2015 SIGMOD, pages 1887–1901, New York, NY, USA. ACM.
Publicado
02/10/2017
SCABORA, Lucas C.; OLIVEIRA, Paulo H.; KASTER, Daniel S.; TRAINA, Agma J. M.; TRAINA-JR, Caetano. Relational graph data management on the edge: Grouping vertices’ neighborhood with Edge-k. In: SIMPÓSIO BRASILEIRO DE BANCO DE DADOS (SBBD), 32. , 2017, Uberlândia/MG. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2017 . p. 124-135. ISSN 2763-8979. DOI: https://doi.org/10.5753/sbbd.2017.171357.