Semantic Search Mechanism Based on Word Embeddings over Data from the Currículo Lattes, Graduate Programs, and Research Groups
Abstract
Searching for researchers and scientific publications is fundamental to accessing academic knowledge. However, search mechanisms based on keyword matching can ignore the semantics of a query, which can lead to results of little relevance. This research proposes the implementation and analysis of a semantic search mechanism that uses Word Embeddings to provide more relevant answers in the academic context. The study presents an architecture and an implementation that enable efficient semantic searches over scientific databases through the transformation and indexing of Word Embeddings.
References
Deepak, G. and Santhanavijayan, A. (2022). Uqscm-rfd: A query–knowledge interfacing approach for diversified query recommendation in semantic search based on river flow dynamics and dynamic user interaction. Neural Computing and Applications, 34(1):651–675.
dos Santos, M. S., de Jesus Oliveira, V. H., de Freitas Jorge, E. M., and de Meireles Costa, G. (2024). Solução para mapeamento e consulta das competências dos pesquisadores: uma arquitetura para extração, integração e consultas de informações acadêmicas. Cadernos de Prospecção, 17(2):671–688.
Dresch, A., Lacerda, D. P., and Junior, J. A. V. A. (2020). Design science research: método de pesquisa para avanço da ciência e tecnologia. Bookman Editora.
Farmanbar, M., Van Ommeren, N., and Zhao, B. (2020). Semantic search with domain-specific word-embedding and production monitoring in fintech. In Proceedings of the 28th International Conference on Computational Linguistics: System Demonstrations, pages 28–33.
Forgues, G., Pineau, J., Larchevêque, J.-M., and Tremblay, R. (2014). Bootstrapping dialog systems with word embeddings. In Nips, modern machine learning and natural language processing workshop, volume 2, page 168.
Gundyreva, E., Pivovarova, L., and Zosa, E. (2022). Unsupervised linking of scientific articles to food systems taxonomies.
Gupta, S. (2017). A survey on search engines. Journal for Research, 2(11).
Jbene, M., Tigani, S., Saadane, R., and Chehri, A. (2021). Deep neural network and boosting based hybrid quality ranking for e-commerce product search. Big Data and Cognitive Computing, 5(3):35.
Rastogi, N., Verma, P., and Kumar, P. (2021). Query expansion based on word embeddings and ontologies for efficient information retrieval. International Journal of Advanced Computer Science and Applications, 12(11).
Sharma, A. and Kumar, S. (2022). Shallow neural network and ontology-based novel semantic document indexing for information retrieval. Intelligent Automation & Soft Computing, 34(3):1989–2005.
Sharma, D. K., Pamula, R., and Chauhan, D. (2021). Semantic approaches for query expansion. Evolutionary Intelligence, 14(2):1101–1116.
Sheela, A. S. and Jayakumar, C. (2019). Comparative study of syntactic search engine and semantic search engine: A survey. In 2019 Fifth International Conference on Science Technology Engineering and Mathematics (ICONSTEM), volume 1, pages 1–4.
Ta, C. V., Reiner, F., von Detten, I., and Stöhr, F. (2022). Touché-task 1-team korg: Finding pairs of argumentative sentences using embeddings. In CLEF (Working Notes), pages 3131–3148.
Tuncer, I., Kara, K. C., and Karakaş, A. (2021). Improving search relevance with word embedding based clusters. In Trends in Data Engineering Methods for Intelligent Systems: Proceedings of the International Conference on Artificial Intelligence and Applied Mathematics in Engineering (ICAIAME 2020), pages 15–24. Springer.
Published
05/11/2024
How to Cite
BATISTA, João Vítor Café dos R.; COSTA, Gleidson de Meireles; JORGE, Eduardo Manuel de Freitas. Mecanismo de Busca Semântica Baseado em Word Embeddings em Dados do Currículo Lattes, Programas de Pós-Graduação e Grupos de Pesquisa. In: ESCOLA REGIONAL DE COMPUTAÇÃO BAHIA, ALAGOAS E SERGIPE (ERBASE), 24., 2024, Salvador/BA. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2024. p. 109-118. DOI: https://doi.org/10.5753/erbase.2024.4430.
