A Methodology to Handle Social Media Posts in Brazilian Portuguese for Text Mining Applications

  • Milton Stiilpen Junior UFOP
  • Luiz Henrique de Campos Merschmann UFOP

Abstract


Online social networks emerged at the beginning of the 21st century and show every sign of lasting. Almost two-thirds of social media users report using a social media website every day, so the volume of data across these platforms is huge. Natural language processing of social media texts is therefore an attractive research topic. While there are many studies on natural language processing of social media texts for some languages (e.g., English), research on Brazilian Portuguese is still limited. This paper proposes a methodology to deal with the peculiarities of Brazilian Portuguese in informal, short, and noisy texts, where the lack of context poses obstacles to text mining. The proposed methodology was evaluated on two tasks (Text Categorization and Opinion Mining), and the experiments showed that the preprocessing mechanisms included in the methodology were important for achieving better results.
Published
2016-11-08
How to Cite

STIILPEN JUNIOR, Milton; MERSCHMANN, Luiz Henrique de Campos. A Methodology to Handle Social Media Posts in Brazilian Portuguese for Text Mining Applications. In: SIMPÓSIO BRASILEIRO DE SISTEMAS MULTIMÍDIA E WEB (WEBMEDIA), 22., 2016, Teresina. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2016. p. 239-246.