MSL-DE - A Lexical Substitution Method based on Dictionaries and Embeddings.

  • Isaias Frederick Januario (UFOP)
  • Álvaro R. Pereira Jr. (UFOP)

Abstract

Lexical substitution has evolved noticeably in the literature, mainly in the data sources used to generate the candidate substitutes that feed the process. Dictionaries and thesauri are widely used because they group synonyms in their structure, but the polysemy of words prevents terms from being exchanged directly without analyzing the context. Vector space models, such as embeddings, are used to represent both substitutes and contexts. However, representing words by contextual factors alone can, in many cases, bring together terms that are not exactly synonyms. These characteristics suggest that using dictionaries and embeddings together is a promising alternative. We therefore present a method that uses the information contained in merged dictionaries, together with their linguistic relations structured in taxonomies. The method measures how well a potential synonym preserves the meaning of the sentence by observing how frequently it is used in small contexts. We also use the complete context as input to vector operations that highlight the best synonyms in a previously selected set. The results show the effectiveness of the method, which surpasses many established methods from the literature at predicting the best substitute for words in instances of a known benchmark.
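
The general idea of combining a dictionary with embeddings can be illustrated with a minimal sketch. It is not the MSL-DE implementation: the toy vectors, the toy dictionary, and the names `EMBEDDINGS`, `DICTIONARY`, `cosine`, `mean_vector`, and `rank_substitutes` are all assumptions made for illustration. The sketch takes candidates from a dictionary entry, then ranks them by cosine similarity to the averaged embedding of the context words, which is one simple kind of vector operation over a previously selected set.

```python
import math

# Toy 3-dimensional embeddings (hypothetical values, for illustration only).
EMBEDDINGS = {
    "bright": [0.9, 0.1, 0.0],
    "smart": [0.1, 0.9, 0.1],
    "shiny": [0.9, 0.0, 0.2],
    "student": [0.0, 0.8, 0.3],
    "exam": [0.1, 0.7, 0.4],
}

# Toy dictionary (assumption): candidate synonyms of a target word,
# pooled over all of its senses, so polysemy is not yet resolved.
DICTIONARY = {"bright": ["smart", "shiny"]}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def mean_vector(words):
    """Average the embeddings of the words that have a known vector."""
    vectors = [EMBEDDINGS[w] for w in words if w in EMBEDDINGS]
    if not vectors:
        raise ValueError("no context word has an embedding")
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def rank_substitutes(target, context_words):
    """Rank the dictionary candidates of `target` by similarity to the context."""
    context = mean_vector(context_words)
    candidates = [c for c in DICTIONARY.get(target, []) if c in EMBEDDINGS]
    return sorted(
        candidates,
        key=lambda c: cosine(EMBEDDINGS[c], context),
        reverse=True,
    )


# Context words of "The bright student passed the exam", target removed.
print(rank_substitutes("bright", ["student", "exam"]))
```

In this toy setting the dictionary restricts the candidates to listed synonyms, and the context vector decides which sense fits, so "smart" is ranked above "shiny" for a sentence about a student and an exam.
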
Keywords: Lexical Substitution, Embeddings, WordNet
Published
30/11/2020
JANUARIO, Isaias Frederick; PEREIRA JR., Álvaro R. MSL-DE - A Lexical Substitution Method based on Dictionaries and Embeddings. In: BRAZILIAN SYMPOSIUM ON MULTIMEDIA AND THE WEB (WEBMEDIA), 1., 2020, Evento Online. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2020. p. 269-276.