Constructing a KBQA Framework: Design and Implementation

  • Rômulo Chrispim de Mello UFJF
  • Jorão Gomes Jr. WU
  • Jairo Francisco de Souza UFJF
  • Victor Ströele UFJF

Abstract

The exponential growth of data on the internet has made information retrieval increasingly challenging. Knowledge-Based Question Answering (KBQA) frameworks offer an efficient way to provide accurate and relevant information quickly. However, these frameworks face significant challenges, especially with complex queries that involve multiple entities and properties. This paper studies KBQA frameworks, focusing on improving entity recognition, property extraction, and query generation using advanced Natural Language Processing (NLP) and Artificial Intelligence (AI) techniques. We implemented and evaluated combinations of tools for extracting entities and properties, and the combination of models achieved the best performance. Our evaluation covered entity and property retrieval, SPARQL query completeness, and accuracy. The results demonstrate the effectiveness of our approach, with high accuracy in identifying entities and properties.
Keywords: KBQA, Complex Questions, Entity Recognition, Property Extraction, SPARQL
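
To make the pipeline described in the abstract more concrete, the following is a minimal sketch, not the authors' implementation. It assumes that "combining tools" means taking the union of the identifiers each tool proposes, that retrieval is scored with set-based precision, recall, and F1, and that the final query follows a single-triple Wikidata template. The function names and the two tool outputs are hypothetical; the abstract does not specify any of them.

```python
from typing import Iterable, Set, Tuple


def combine_candidates(*tool_outputs: Iterable[str]) -> Set[str]:
    """Union the identifiers proposed by several extraction tools (assumed strategy)."""
    combined: Set[str] = set()
    for output in tool_outputs:
        combined.update(output)
    return combined


def precision_recall_f1(predicted: Set[str], gold: Set[str]) -> Tuple[float, float, float]:
    """Set-based retrieval scores for predicted vs. gold identifiers."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1


def build_sparql(entity_id: str, property_id: str) -> str:
    """Fill a single-triple template (hypothetical; real templates may be far richer)."""
    return f"SELECT ?answer WHERE {{ wd:{entity_id} wdt:{property_id} ?answer . }}"


# Hypothetical outputs of two entity linkers for "Who is the spouse of Barack Obama?"
tool_a_entities = ["Q76"]
tool_b_entities = ["Q76", "Q30"]
entities = combine_candidates(tool_a_entities, tool_b_entities)

# Score the combined entities against an assumed gold annotation.
precision, recall, f1 = precision_recall_f1(entities, {"Q76"})

# Build the query from the gold entity and the "spouse" property (P26).
query = build_sparql("Q76", "P26")
print(precision, recall, round(f1, 3))
print(query)
```

In this sketch the union raises recall at the cost of precision, which is one reason a combination step usually needs a later ranking or filtering stage before query generation.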

Published
14/10/2024
MELLO, Rômulo Chrispim de; GOMES JR., Jorão; SOUZA, Jairo Francisco de; STRÖELE, Victor. Constructing a KBQA Framework: Design and Implementation. In: BRAZILIAN SYMPOSIUM ON MULTIMEDIA AND THE WEB (WEBMEDIA), 30., 2024, Juiz de Fora/MG. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2024. p. 89-97. DOI: https://doi.org/10.5753/webmedia.2024.243150.