Performance Analysis of Platforms Graph Processing

  • Daniel N. R. da Silva National Laboratory for Scientific Computing
  • Klaus Wehmuth National Laboratory for Scientific Computing
  • Carla Osthoff National Laboratory for Scientific Computing
  • Ana Paula Appel IBM Research
  • Artur Ziviani National Laboratory for Scientific Computing

Abstract


The analysis of complex networks, represented by graphs, finds applications in several areas of knowledge. In this paper, we analyze the performance of graph processing platforms, which represent different approaches to the problem. In our evaluation, we also consider analysis algorithms that are of great interest of the community (connectivity, centrality, and path), as well as a set of real and synthetic networks with diverse topological characteristics and dimensions. The obtained experimental results contribute with guidelines for those interest to better select the most efficient graph processing platform to their analytics interests.
Keywords: Complex networks, Graphs

References

Barabási, A. e Albert, R. (1999). Emergence of scaling in random networks. Science, 286(5439):509–512.

Chakrabarti, D. e Faloutsos, C. (2006). Graph mining: Laws, generators, and algorithms. ACM Computing Surveys, 38(1):2.

Cormen, T. H., Leiserson, C. E., Rivest, R. L., e Stein, C. (2009). Introduction to algorithms. MIT Press.

Doekemeijer, N. e Varbanescu, A. (2014). A survey of parallel graph processing frameworks. Technical report, Delft University of Technology.

Elser, B. e Montresor, A. (2013). An evaluation study of big data frameworks for graph processing. In Proc. IEEE Int. Conf. on Big Data, pages 60–67.

Erdős, P. e Rényi, A. (1959). On random graphs i. Publ. Mathematicae Debrecen, 6:290–297.

Gonzalez, J., Low, Y., Gu, H., Bickson, D., e Guestrin, C. (2012). PowerGraph: Distributed graph-parallel computation on natural graphs. In Proc. USENIX Symp. on Operating Systems Design and Implementation (OSDI), pages 17–30.

Gonzalez, J., Xin, R., Dave, A., Crankshaw, D., Franklin, M., e Stoica, I. (2014). GraphX: Graph processing in a distributed dataflow framework. In Proc. USENIX Symp. on Operating Systems Design and Implementation (OSDI), pages 599–613.

Graham, S., Kessler, P., e Mckusick, M. (1982). Gprof: A call graph execution profiler. In Proc. ACM SIGPLAN Notices, pages 120–126.

Guo, Y., Biczak, M., Varbanescu, A., Iosup, A., Martella, C., e Willke, T. (2014). How well do graph-processing platforms perform? an empirical performance evaluation and analysis. In Proc. IEEE Int. Parallel and Distributed Processing Symp. (IPDS), pages 395–404.

Han, M., Daudjee, K., Ammar, K., Özsu, M., Wang, X., e Jin, T. (2014). An experimental comparison of pregel-like graph processing systems. In Proc. VLDB Endowment, pages 1047–1058.

Iosup, A., Hegeman, T., Ngai, W., Heldens, S., Pérez, A., Manhardt, T., Chafi, H., Capota, M., Sundaram, N., Anderson, M., et al. (2016). LDBC graphalytics: A benchmark for large-scale graph analysis on parallel and distributed platforms. Technical report, Delft University of Technology.

Jouili, S. e Vansteenberghe, V. (2013). An empirical comparison of graph databases. In Proc. ASE/IEEE Int. Conf. on Social Computing (SocialCom), pages 708–715.

Koch, J., Staudt, C., Vogel, M., e Meyerhenke, H. (2016). An empirical comparison of big graph frameworks in the context of network analysis. ArXiv e-prints, abs/1601.00289.

Leskovec, J., Chakrabarti, D., Kleinberg, J., Faloutsos, C., e Ghahramani, Z. (2010). Kronecker graphs: An approach to modeling networks. Journal of Machine Learning Research, 11:985–1042.

Lu, Y., Cheng, J., Yan, D., e Wu, H. (2014). Large-scale distributed graph computing systems: An experimental evaluation. In Proc. VLDB Endowment, pages 281–292.

Lumsdaine, A., Gregor, D., Hendrickson, B., e Berry, J. (2007). Challenges in parallel graph processing. Parallel Processing Letters, 17(1):5–20.

Malewicz, G., Austern, M., e Bik, A. (2010). Pregel: A system for large-scale graph processing. In Proc. ACM SIGMOD Int. Conf. on Management of Data, pages 135–145.

Newman, M. (2010). Networks: An introduction. Oxford University Press.

Page, L., Brin, S., Motwani, R., e Winograd, T. (1999). The pagerank citation ranking: Bringing order to the web. Technical report, Stanford InfoLab.

Penrose, M. (2003). Random geometric graphs. Oxford University Press.

Pires, R., Rêgo, L., Macêdo, J., e Vidal, V. (2015). Processamento de grafos em big data. In Hara, C. S., Porto, F., e Ogasawawara, E., editors, Tópicos em Gerenciamento de Dados e Informações – SBBD 2015, chapter 1, pages 8–38. SBC

Satish, N., Sundaram, N., Patwary, M., Seo, J., Park, J., Hassaan, M., Sengupta, S., Yin, Z., e Dubey, P. (2014). Navigating the maze of graph analytics frameworks using massive graph datasets. In Proc. ACM SIGMOD Int. Conf. on Management of Data, pages 979–990.

Silva, T. e Zhao, L. (2016). Machine learning in complex networks. Springer.

Staudt, C., Sazonovs, A., e Meyerhenke, H. (2014). NetworKit: An interactive tool suite for high-performance network analysis. ArXiv e-prints, abs/1403.3005.

Zaharia, M., Chowdhury, M., Das, T., Dave, A., Ma, J., McCauley, M., Franklin, M. J., Shenker, S., e Stoica, I. (2012). Resilient distributed datasets: A fault-tolerant abstraction for in-memory cluster computing. In Proc. USENIX Conf. on Networked Systems Design and Implementation (NSDI), pages 15–28.

Zhao, Y., Yoshigoe, K., Xie, M., Zhou, S., Seker, R., e Bian, J. (2014). Evaluation and analysis of distributed graph-parallel processing frameworks. Journal of Cyber Security and Mobility, 3(3):289–316.
Published
2016-10-04
DA SILVA, Daniel N. R.; WEHMUTH, Klaus; OSTHOFF, Carla; APPEL, Ana Paula; ZIVIANI, Artur. Performance Analysis of Platforms Graph Processing. In: BRAZILIAN SYMPOSIUM ON DATABASES (SBBD), 31. , 2016, Salvador/BA. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2016 . p. 16-27. ISSN 2763-8979. DOI: https://doi.org/10.5753/sbbd.2016.24305.