Query Answer Reformulation over Knowledge Bases
Keywords: Aggregation, Summarization, Natural Language Query (NLQ), Question Answering (QA), RDF, Semantic Web
The answer to a query submitted to a database or a knowledge base is often long and may contain redundant data. The user is frequently forced to browse through a long answer, or to refine and repeat the query until the answer reaches a manageable size. Without proper treatment, consuming the answer may become a tedious task. This article proposes a process that modifies the presentation of a query answer to improve the quality of the user’s experience in the context of an RDF knowledge base. The process reorganizes the original query answer by applying heuristics that summarize the results and select template questions, which create a user dialog that guides the presentation of the results. The article also includes experiments based on RDF versions of MusicBrainz, enriched with DBpedia data, and IMDb, each with over 200 million RDF triples. The experiments use sample queries from well-known benchmarks.
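To make the idea of summarizing an answer and prompting the user with a template question more concrete, the following is a minimal sketch and not the article's actual heuristics. It assumes the query answer is a list of rows with property/value pairs, and it uses one illustrative heuristic: group the rows by the property with the fewest distinct values. The function names `summarize` and `template_question`, the `max_groups` parameter, the question wording, and the sample rows are all hypothetical.

```python
from collections import defaultdict

# Hypothetical query answer: each row maps a property name to a value.
SAMPLE_ROWS = [
    {"title": "A", "type": "Album", "country": "US"},
    {"title": "B", "type": "Album", "country": "UK"},
    {"title": "C", "type": "Single", "country": "US"},
    {"title": "D", "type": "Album", "country": "BR"},
]


def summarize(rows, max_groups=5):
    """Group rows by the property that splits them into the fewest groups.

    Illustrative heuristic only: a property qualifies if it has more than
    one and at most `max_groups` distinct values across the rows.
    """
    if not rows:
        return None, {}
    candidates = {}
    for prop in rows[0]:
        distinct = {row.get(prop) for row in rows}
        if 1 < len(distinct) <= max_groups:
            candidates[prop] = len(distinct)
    if not candidates:
        return None, {}
    # Fewest distinct values wins; ties are broken by property name.
    best = min(candidates, key=lambda p: (candidates[p], p))
    groups = defaultdict(list)
    for row in rows:
        groups[row.get(best)].append(row)
    return best, dict(groups)


def template_question(prop, groups):
    """Fill a fixed (assumed) question template from the grouped answer."""
    total = sum(len(members) for members in groups.values())
    ordered = sorted(groups.items(), key=lambda kv: str(kv[0]))
    options = ", ".join(f"{value} ({len(members)})" for value, members in ordered)
    return f"The answer has {total} results. Filter by {prop}? Options: {options}"


prop, groups = summarize(SAMPLE_ROWS)
print(template_question(prop, groups))
```

On the sample rows, `type` is selected because it splits the answer into two groups, fewer than `country` or `title`. A dialog built this way would repeat the step on whichever group the user picks.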