Semantic Structuring of E-commerce Texts: The QART Framework

Abstract


Transforming natural language texts into structured knowledge representations is important for improving data integration in e-commerce. We developed the QART framework to address this challenge by converting e-commerce questions and answers into RDF triples that can be integrated into existing Knowledge Graphs (KGs). The framework consists of four main steps: field selection and pre-processing, text-to-text conversion, text triplifying, and RDF triple curation. These steps aim to manage the volume and complexity of e-commerce data while ensuring that the output is semantically correct and consistent with predefined ontologies. Our evaluations showed that intermediary steps, such as text summarization, produce competitive results and can improve the quality of the resulting triples.
Keywords: Natural Language Processing, E-commerce, Knowledge Graphs, RDF Triples
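
The four steps named in the abstract can be pictured as a small pipeline. The following is only a minimal sketch under stated assumptions, not the authors' implementation: the record layout, the `http://example.org/` namespace, the `compatibleWith` predicate, and every function name are hypothetical, and the regular-expression rule in `text_to_text` is a toy stand-in for the summarization-style conversion the abstract mentions.

```python
import re
from dataclasses import dataclass

EX = "http://example.org/"               # hypothetical namespace (assumption)
ALLOWED_PREDICATES = {"compatibleWith"}  # hypothetical ontology vocabulary (assumption)


@dataclass(frozen=True)
class Triple:
    subject: str
    predicate: str
    obj: str


def select_and_preprocess(record):
    # Step 1: keep only the fields of interest and normalize whitespace.
    return {k: " ".join(str(record[k]).split())
            for k in ("product", "question", "answer")}


def text_to_text(fields):
    # Step 2: rewrite the Q&A pair as one declarative sentence.
    # Toy rule only; a real system would use a summarization model here.
    match = re.search(r"(?:work|compatible) with (?:the |an? )?(.+?)\??$",
                      fields["question"], re.IGNORECASE)
    if match and fields["answer"].lower().startswith("yes"):
        return f"{fields['product']} is compatible with {match.group(1)}"
    return None


def triplify(sentence):
    # Step 3: split the sentence into subject, predicate, object.
    subject, sep, obj = sentence.partition(" is compatible with ")
    return [Triple(subject, "compatibleWith", obj)] if sep else []


def curate(triples):
    # Step 4: keep triples whose predicate is in the ontology; drop duplicates.
    kept = []
    for triple in triples:
        if triple.predicate in ALLOWED_PREDICATES and triple not in kept:
            kept.append(triple)
    return kept


def iri(label):
    # Build an IRI from a label by replacing non-word characters.
    slug = re.sub(r"\W+", "_", label.strip())
    return f"<{EX}{slug}>"


def to_ntriples(triple):
    # Serialize one triple as an N-Triples line.
    return f"{iri(triple.subject)} {iri(triple.predicate)} {iri(triple.obj)} ."


def qa_to_triples(record):
    sentence = text_to_text(select_and_preprocess(record))
    return curate(triplify(sentence)) if sentence else []


example = {"product": "Acme Charger", "seller_id": 42,
           "question": "Does it work   with the Acme Phone?",
           "answer": "Yes, it does."}
for t in qa_to_triples(example):
    print(to_ntriples(t))
# <http://example.org/Acme_Charger> <http://example.org/compatibleWith> <http://example.org/Acme_Phone> .
```

Keeping each step as a separate function mirrors the staged design described in the abstract: the text-to-text step can be swapped for a different converter without touching triplification or curation.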

Published
14/10/2024
REGINO, André Gomes; DOS REIS, Julio Cesar. Semantic Structuring of E-commerce Texts: The QART Framework. In: WORKSHOP DE TESES E DISSERTAÇÕES (WTDBD) - SIMPÓSIO BRASILEIRO DE BANCO DE DADOS (SBBD), 39., 2024, Florianópolis/SC. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2024. p. 144-150. DOI: https://doi.org/10.5753/sbbd_estendido.2024.243761.