AER-MinT - Support for the Process of Extraction of Relationships Based on Textual Data Mining
Abstract
The growth of unstructured data on the Web opens up new opportunities, among them the acquisition of knowledge through information extraction. To this end, dataset enrichment approaches have begun to use unstructured data, adopting machine learning algorithms to increase their effectiveness. However, supporting tools are lacking and few datasets are available. This article therefore proposes AER-MinT, an approach that applies a model trained on a corpus of texts, using BERT and a Convolutional Neural Network, to support the extraction of relations from sentences. The extracted relations can then be explored through an RDF graph.
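To illustrate the final step mentioned in the abstract, the following is a minimal sketch of how extracted relations might be written out as RDF triples in N-Triples syntax. It is not the authors' implementation: the `BASE` namespace, the helper names `to_iri` and `to_ntriples`, and the sample relations are all assumptions made for illustration, and the BERT and CNN classifier that would produce the relations is not shown.

```python
from urllib.parse import quote

# Hypothetical namespace, used only for illustration (not from the paper).
BASE = "http://example.org/"


def to_iri(label: str) -> str:
    """Turn a free-text label into an IRI under the illustrative BASE namespace."""
    return f"<{BASE}{quote(label.strip().replace(' ', '_'))}>"


def to_ntriples(relations):
    """Serialize (subject, relation, object) tuples as N-Triples lines."""
    return "\n".join(
        f"{to_iri(s)} {to_iri(p)} {to_iri(o)} ." for s, p, o in relations
    )


# Made-up classifier output, standing in for relations extracted from sentences.
extracted = [
    ("Marie Curie", "born in", "Warsaw"),
    ("Marie Curie", "worked at", "University of Paris"),
]

ntriples = to_ntriples(extracted)
print(ntriples)
```

Each printed line is one subject, predicate, object statement, which a triple store or RDF library could load for graph exploration.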
Keywords:
Artificial Intelligence, RDF Graph, Natural Language Processing, Relationship Extraction
Published
2022-09-19
How to Cite
AVELINO, Jones O.; CORDEIRO, Kelli F.; C. CAVALCANTI, Maria. AER-MinT - Support for the Process of Extraction of Relationships Based on Textual Data Mining. In: BRAZILIAN SYMPOSIUM ON DATABASES (SBBD), 37., 2022, Búzios. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2022. p. 409-414. ISSN 2763-8979. DOI: https://doi.org/10.5753/sbbd.2022.226201.
