Recognizing Power-law Graphs by Machine Learning Algorithms using a Reduced Set of Structural Features
The empirical study of large real world networks in the last 20 years showed that a variety of real-world graphs are power-law. There are evidence that optimization problems seem easier in these graphs; however, for a given graph, classifying it as power-law is a problem in itself. In this work, we propose using machine learning algorithms (KNN, SVM, Gradient Boosting and Random Forests) for this task. We suggest a graph representation based on [Canning et al. 2018], but using a much simplified set of structural properties of the graph. We compare the proposed representation with the one generated by the graph2vec framework. The experiments attained high accuracy, indicating that a reduced set of structural graph properties is enough for the presented problem.
Bollobás, B., Borgs, C., Chayes, J., and Riordan, O. (2003). Directed scale-free graphs. In Proceedings of the Fourteenth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA ’03, pages 132–139, Philadelphia, PA, USA. Society for Industrial and Applied Mathematics.
Bonner, S., Brennan, J., Theodoropoulos, G., Kureshi, I., and McGough, A. S. (2016). Deep topology classification: A new approach for massive graph classification. In 2016 IEEE International Conference on Big Data (Big Data), pages 3290–3297.
Broder, A., Kumar, R., Maghoul, F., Raghavan, P., Rajagopalan, S., Stata, R., Tomkins, A., and Wiener, J. (2000). Graph structure in the web. Computer Networks: The International Journal of Computer and Telecommunications Networking, 33(1-6):309– 320.
Canning, J. P., Ingram, E. E., Nowak-Wolff, S., Ortiz, A. M., Ahmed, N. K., Rossi, R. A., Schmitt, K. R. B., and Soundarajan, S. (2018). Predicting Graph Categories from Structural Properties. arXiv e-prints, page arXiv:1805.02682.
Clauset, A., Shalizi, C. R., and Newman, M. E. J. (2009). Power-law distributions in empirical data. SIAM Review, 51(4):661–703.
Eubank, S., Kumar, V., Marathe, M. V., Srinivasan, A., and Wang, N. (2004). Structural and algorithmic aspects of massive social networks. In Proceedings of the Fifteenth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA ’04, pages 718–727. Society for Industrial and Applied Mathematics.
Faloutsos, M., Faloutsos, P., and Faloutsos, C. (1999). On power-law relationships of the internet topology. ACM SIGCOMM Computer Communication Review, 29(4):251– 262.
Foster, I., Ripeanu, M., and Iamnitchi, A. (2002). Mapping the gnutella network. IEEE Internet Computing, 6:50–57.
Gast, M. and Hauptmann, M. (2014). Approximability of the vertex cover problem in power-law graphs. Theoretical Computer Science, 516:60–70.
Gast, M., Hauptmann, M., and Karpinski, M. (2012). Improved approximation lower bounds for vertex cover on power law graphs and some generalizations. arXiv preprint arXiv:1210.2698.
Gast, M., Hauptmann, M., and Karpinski, M. (2015). Inapproximability of dominating set on power law graphs. Theoretical Computer Science, 562:436–452.
Ghosh, S., Das, N., Gonçalves, T., Quaresma, P., and Kundu, M. (2018). The journey of graph kernels through two decades. Computer Science Review, 27:88 – 111.
Grandjean, M. (2014). La connaissance est un réseau. Les cahiers du numérique, 10(3):37–54.
Guelzim, N., Bottani, S., Bourgine, P., and Képès, F. (2002). Topological and causal structure of the yeast transcriptional regulatory network. Nature Genetics, 31(1):60.
Holme, P. and Kim, B. J. (2002). Growing scale-free networks with tunable clustering. Physical Review E, 65(2):026107.
Kleinberg, J. and Lawrence, S. (2001). The structure of the web. Science, 294(5548):1849–1850.
Kleinberg, J. M., Kumar, R., Raghavan, P., Rajagopalan, S., and Tomkins, A. S. (1999). The web as a graph: Measurements, models, and methods. In Proceedings of the 5th Annual International Conference on Computing and Combinatorics, COCOON’99, pages 1–17, Berlin, Heidelberg. Springer-Verlag.
Kumar, R., Raghavan, P., Rajagopalan, S., Sivakumar, D., Tomkins, A., and Upfal, E. (2000). Stochastic models for the web graph. In Proceedings 41st Annual Symposium on Foundations of Computer Science, pages 57–65. IEEE.
Le, Q. and Mikolov, T. (2014). Distributed representations of sentences and documents. In International Conference on Machine Learning, pages 1188–1196.
Leskovec, J., Adamic, L. A., and Huberman, B. A. (2007). The dynamics of viral marketing. ACM Transactions on the Web (TWEB), 1(1).
Leskovec, J., Kleinberg, J., and Faloutsos, C. (2005). Graphs over time: Densification laws, shrinking diameters and possible explanations. In Proceedings of the Eleventh ACM SIGKDD International Conference on Knowledge Discovery in Data Mining, KDD ’05, pages 177–187, New York, NY, USA. ACM.
Mitchell, T. M. (1997). Machine Learning. McGraw-Hill, Inc., New York, NY, USA, 1 edition.
Narayanan, A., Chandramohan, M., Venkatesan, R., Chen, L., Liu, Y., and Jaiswal, S. (2017). graph2vec: Learning distributed representations of graphs. CoRR, abs/1707.05005.
Pham, T., Tran, T., Dam, H., and Venkatesh, S. (2017). Graph classification via deep learning with virtual nodes. CoRR, abs/1708.04357.
Rossi, R. A. and Ahmed, N. K. (2015). The network data repository with interactive graph analytics and visualization. In Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence.
Siganos, G., Faloutsos, M., Faloutsos, P., and Faloutsos, C. (2003). Power laws and the as-level internet topology. IEEE/ACM Transactions on Networking, 11(4):514–524.
Silva, M. O. d., Gimenez-Lugo, G. A., and Da Silva, M. V. (2013). Vertex cover in complex networks. International Journal of Modern Physics C (IJMPC), 24(11):1–9.
Tixier, A. J., Nikolentzos, G., Meladianos, P., and Vazirgiannis, M. (2017). Classifying graphs as images with convolutional neural networks. CoRR, abs/1708.02218.
Vignatti, A. L. and da Silva, M. V. G. (2016). Minimum vertex cover in generalized random graphs with power law degree distribution. Theoretical Computer Science, 647:101–111.