Generating Entity Representation from Online Discussions - Challenges and an Evaluation Framework

  • Túlio C. Loures UFMG
  • Pedro O. S. Vaz de Melo UFMG
  • Adriano A. Veloso UFMG

Abstract


Because of the ubiquitous use of the Internet in current society, it is easy to find groups or communities of people discussing the most varied subjects. Learning about these subjects (or entities) from such discussions is of great interest to companies, organizations, public figures (e.g. politicians) and researchers alike. In this paper, we explore the problem of learning entity representations using online discussions about them as the only source of information. While such discussions may reveal relevant and surprising information about the corresponding subjects, they may also be completely irrelevant. As another challenge, while regular text documents usually contain well-structured language, online discussions often contain informal and misspelled words. Here we formally define the problem, propose a new benchmark for evaluating vector representation methods, and perform an in-depth evaluation of well-known techniques using three proposed evaluation scenarios: (i) clustering, (ii) ordering and (iii) recommendation. Results show that each method is better than at least one other in some evaluation
Published
17/10/2017
How to Cite

LOURES, Túlio C.; MELO, Pedro O. S. Vaz de; VELOSO, Adriano A. Generating Entity Representation from Online Discussions - Challenges and an Evaluation Framework. In: SIMPÓSIO BRASILEIRO DE SISTEMAS MULTIMÍDIA E WEB (WEBMEDIA), 2017, Gramado. Anais do XXIII Simpósio Brasileiro de Sistemas Multimídia e Web. Porto Alegre: Sociedade Brasileira de Computação, Oct. 2017. p. 197-204.