An Entity Resolution Approach Based on Word Embeddings and Knowledge Bases for Microblog Texts

  • Luan Souza UFOP
  • Anderson Ferreira UFOP


In data management for information systems, most entity resolution proposals operate on structured data or on long texts that contain contextual information. In short texts, such as microblog posts, the lack of context can complicate the disambiguation of the named entities they mention. Word embeddings, on the other hand, have proven promising for enriching contextual information and for estimating similarity. In this work, we therefore propose an approach for disambiguating named entities extracted from short texts by linking them to documents in a knowledge base, using word embeddings and three strategies to find the correct document. Strategy 1 relies on the other entity names in the short text. Strategy 2 exploits the categories of the candidate documents to be linked to the names. Strategy 3 relies on the similarity between the documents associated with the other named entities in the text and the candidate documents for the target named entity. In our experimental evaluation, the proposed approach outperforms other approaches commonly used for the entity resolution task.
Keywords: entity resolution, word embedding, named entity, information system
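
The general idea of using embedding similarity to choose among candidate knowledge base documents can be illustrated with a minimal sketch. It is loosely inspired by Strategy 1 (using the other entity names in the short text) and is not the authors' implementation. The toy vectors, the document identifiers, and the helper names `mean_vector`, `cosine`, and `rank_candidates` are all hypothetical, and averaging word vectors is an assumed simplification.

```python
import math

# Toy 3-dimensional embeddings; hypothetical values for illustration only.
TOY_EMBEDDINGS = {
    "car": [0.9, 0.1, 0.0],
    "engine": [0.8, 0.2, 0.1],
    "animal": [0.1, 0.9, 0.1],
    "jungle": [0.0, 0.8, 0.3],
    "ford": [0.9, 0.0, 0.1],
}


def mean_vector(words, embeddings):
    """Average the vectors of the known words; return None if none are known."""
    vectors = [embeddings[w] for w in words if w in embeddings]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_candidates(context_words, candidates, embeddings):
    """Rank candidate documents by similarity to the other names in the text."""
    context_vec = mean_vector(context_words, embeddings)
    scored = []
    for doc_id, doc_words in candidates.items():
        doc_vec = mean_vector(doc_words, embeddings)
        score = cosine(context_vec, doc_vec) if context_vec and doc_vec else 0.0
        scored.append((doc_id, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical short text mentioning "Jaguar" together with "Ford".
candidates = {
    "Jaguar_Cars": ["car", "engine"],
    "Jaguar_(animal)": ["animal", "jungle"],
}
ranking = rank_candidates(["ford"], candidates, TOY_EMBEDDINGS)
print(ranking[0][0])
```

In this toy setting the candidate about the car maker ranks first, because its averaged vector lies closer to the vector of the co-occurring name. A real system would load pretrained embeddings and retrieve candidate documents from an actual knowledge base.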


