Investigating Opinion Mining through Language Varieties: a Case Study of Brazilian and European Portuguese tweets

  • Douglas Vitório UFRPE
  • Ellen Souza UFRPE / UFPE
  • Ingryd Teles UFRPE / UPE
  • Adriano L. I. Oliveira UFPE

Abstract


Portuguese is a pluricentric language whose variants differ from one another at several linguistic levels. It is generally agreed that text mining resources developed for one variant may produce different results when applied to another, but very little research has measured this difference. This study analyzes the application of opinion mining to the two main variants of Portuguese: Brazilian and European. The experiments show that the differences between the variants are reflected in the results. Training on one variant and testing on the other causes a substantial performance drop, but separating the variants may not be recommended.

Published
02/10/2017
How to Cite

VITÓRIO, Douglas; SOUZA, Ellen; TELES, Ingryd; OLIVEIRA, Adriano L. I. Investigating Opinion Mining through Language Varieties: a Case Study of Brazilian and European Portuguese tweets. In: SIMPÓSIO BRASILEIRO DE TECNOLOGIA DA INFORMAÇÃO E DA LINGUAGEM HUMANA (STIL), 1., 2017, Uberlândia/MG. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2017. p. 43-52.