CoEPinKB: Evaluating Path Search Strategies in Knowledge Bases

Authors

J. G. Jiménez, L. A. P. Paes Leme, M. A. Casanova

DOI:

https://doi.org/10.5753/jbcs.2022.2211

Keywords:

Entity Relatedness, Similarity Measure, Relationship Path Ranking, Backward Search, Knowledge Base

Abstract

A knowledge base, expressed using the Resource Description Framework (RDF), can be viewed as a graph whose nodes represent entities and whose edges denote relationships. The entity relatedness problem refers to the problem of discovering and understanding how two entities are related, directly or indirectly, that is, how they are connected by paths in a knowledge base. Strategies designed to solve the entity relatedness problem typically adopt an entity similarity measure to reduce the path search space and a path ranking measure to order and filter the list of paths returned. This article presents a framework, called CoEPinKB, that supports the empirical evaluation of such strategies. The proposed framework allows combining entity similarity and path ranking measures to generate different path search strategies. The main goals of this article are to describe the framework and present a performance evaluation of nine different path search strategies.
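
The idea of combining measures into strategies can be illustrated with a small sketch. It is not the CoEPinKB implementation: the toy triples, the function names, the threshold, and the two similarity and two ranking measures below are all assumptions chosen for illustration, and the search is a plain depth-first enumeration rather than the backward search named in the keywords. Each strategy pairs one entity similarity measure, used to prune intermediate nodes, with one path ranking measure, used to order the paths found.

```python
from collections import Counter
from itertools import product

# Toy knowledge base as (subject, predicate, object) triples; all names are invented.
TRIPLES = [
    ("A", "knows", "B"), ("B", "worksWith", "C"),
    ("A", "memberOf", "G"), ("C", "memberOf", "G"),
    ("A", "livesIn", "X"), ("X", "near", "Y"),
]
PRED_FREQ = Counter(p for _, p, _ in TRIPLES)


def neighbors(node):
    """(predicate, neighbor) pairs of node, ignoring edge direction."""
    return [(p, o) if s == node else (p, s)
            for s, p, o in TRIPLES if node in (s, o)]


def jaccard(a, b):
    """Jaccard index of the closed neighborhoods (node plus its neighbors)."""
    na = {a} | {n for _, n in neighbors(a)}
    nb = {b} | {n for _, n in neighbors(b)}
    return len(na & nb) / len(na | nb)


def no_pruning(a, b):
    """Baseline similarity that never prunes."""
    return 1.0


def find_paths(source, target, similarity, threshold=0.1, max_len=3):
    """Enumerate simple paths of at most max_len edges from source to target.

    A neighbor is expanded only if it is the target itself or its
    similarity to the target reaches the threshold.
    """
    paths = []
    stack = [([source], [])]  # (nodes so far, predicates so far)
    while stack:
        nodes, preds = stack.pop()
        if nodes[-1] == target:
            paths.append((nodes, preds))
            continue
        if len(preds) == max_len:
            continue
        for pred, nxt in neighbors(nodes[-1]):
            if nxt in nodes:
                continue  # keep paths simple (no repeated nodes)
            if nxt != target and similarity(nxt, target) < threshold:
                continue  # prune dissimilar intermediate nodes
            stack.append((nodes + [nxt], preds + [pred]))
    return paths


def by_length(path):
    """Sort key: shorter paths first."""
    _, preds = path
    return len(preds)


def by_predicate_rarity(path):
    """Sort key: paths made of rarer predicates first."""
    _, preds = path
    return -sum(1 / PRED_FREQ[p] for p in preds)


SIMILARITIES = {"jaccard": jaccard, "none": no_pruning}
RANKINGS = {"length": by_length, "rarity": by_predicate_rarity}
STRATEGIES = list(product(SIMILARITIES, RANKINGS))  # every (similarity, ranking) pair


def run_strategy(sim_name, rank_name, source, target):
    """Search with one similarity measure, then order paths with one ranking."""
    paths = find_paths(source, target, SIMILARITIES[sim_name])
    return sorted(paths, key=RANKINGS[rank_name])


if __name__ == "__main__":
    for sim_name, rank_name in STRATEGIES:
        print(sim_name, rank_name, run_strategy(sim_name, rank_name, "A", "C"))
```

With these toy triples, the Jaccard variant prunes node X, whose closed neighborhood shares nothing with that of the target C, while both paths from A to C survive. The rarity ranking then places the path through B first, because each of its predicates occurs only once in the data.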

Published

2022-12-14

How to Cite

Jiménez, J. G., Paes Leme, L. A. P., & Casanova, M. A. (2022). CoEPinKB: Evaluating Path Search Strategies in Knowledge Bases. Journal of the Brazilian Computer Society, 28(1), 13–25. https://doi.org/10.5753/jbcs.2022.2211

Section

Articles