Similarity Search using the NK Interaction Graph
Abstract
A similarity search method based on the NK interaction graph is proposed. The NK interaction graph was originally employed for clustering and is built based on distance and spatial density of the objects in a dataset. Two variations of the method are investigated. In the two variations, k objects are returned by visiting vertices of the NK interaction graph from the initial vertex related to the example of the dataset that is closer to the object to be consulted. In NK A, the k objects related to vertices with edges incident to the initial vertex are returned. In NK B, k vertices are visited starting from the initial vertex. The next visited vertex is that one with edge incident to the current vertex and that is closest to the new object to be consulted. The k objects related to the visited vertices are returned. The proposed algorithms are compared with each other and with the search for similarity based only on distance. The experimental results indicate that the proposed methods present good performance when there are clusters with arbitrary shapes in the dataset.
References
CARPINETO, C. & ROMANO, G. (2012). "A survey of automatic query expansion in information retrieval", ACM Computing Surveys (CSUR), 44(1).
ESTER, M.; KRIEGEL, H.-P.; SANDER, J. & XU, X. (1996), “A density-based algorithm for discovering clusters in large spatial databases with noise”, In the Proc. of the 2nd ACM Int. Conf. Knowl. Discovery Data Min. (KDD), 226–231.
FRÄNTI, P. & SIERANOJA, S. (2018). “K-means properties on six clustering benchmark datasets”, Applied Intelligence, 48 (12), 4743-4759.
HRUSCHKA, E. R.; CAMPELLO, R. J. G. B.; FREITAS, A. A. & CARVALHO, A. C. P. L. F. (2009). “A survey of evolutionary algorithms for clustering”, IEEE Transactions on Systems, Man, and Cybernetics, Part C, 39(2): 133-155.
HSINCHUN, C.; CHIANG, R. H. L. & STOREY, V. C. (2012). "Business intelligence and analytics: from big data to big impact", MIS Quarterly, 36(4): 1165-1188.
RODRIGUEZ, A. & LAIO, A. (2014). “Clustering by fast search and find of density peaks,” Science, 344(6191): 1492–1496.
TINÓS, R.; ZHAO, L.; CHICANO, F. & WHITLEY, D. (2018). “NK hybrid genetic algorithm for clustering”, IEEE Transactions on Evolutionary Computation, 22(5): 748-761.
AHA, D. W.; KIBLER, D. & ALBERT, M.K. (1991). “Instance-based learning algorithms”, Machine Learning, 6(1): 37-66.
CARPINETO, C. & ROMANO, G. (2012). "A survey of automatic query expansion in information retrieval", ACM Computing Surveys (CSUR), 44(1).
ESTER, M.; KRIEGEL, H.-P.; SANDER, J. & XU, X. (1996), “A density-based algorithm for discovering clusters in large spatial databases with noise”, In the Proc. of the 2nd ACM Int. Conf. Knowl. Discovery Data Min. (KDD), 226–231.
FRÄNTI, P. & SIERANOJA, S. (2018). “K-means properties on six clustering benchmark datasets”, Applied Intelligence, 48 (12), 4743-4759.
HRUSCHKA, E. R.; CAMPELLO, R. J. G. B.; FREITAS, A. A. & CARVALHO, A. C. P. L. F. (2009). “A survey of evolutionary algorithms for clustering”, IEEE Transactions on Systems, Man, and Cybernetics, Part C, 39(2): 133-155.
HSINCHUN, C.; CHIANG, R. H. L. & STOREY, V. C. (2012). "Business intelligence and analytics: from big data to big impact", MIS Quarterly, 36(4): 1165-1188.
RODRIGUEZ, A. & LAIO, A. (2014). “Clustering by fast search and find of density peaks,” Science, 344(6191): 1492–1496.
TINÓS, R.; ZHAO, L.; CHICANO, F. & WHITLEY, D. (2018). “NK hybrid genetic algorithm for clustering”, IEEE Transactions on Evolutionary Computation, 22(5): 748-761.
