A Methodology to Handle Social Media Posts in Brazilian Portuguese for Text Mining Applications

  • Milton Stiilpen Junior UFOP
  • Luiz Henrique de Campos Merschmann UFOP

Abstract


Online Social Networks emerged at the beginning of the 21st century, and there is evidence that they are here to stay. Almost two-thirds of social media users report using a social media website every day, so the volume of data across these platforms is huge. Natural language processing of social media texts is an attractive topic among researchers in this area. While there are many studies on natural language processing of social media texts for some languages (e.g., English), research on Brazilian Portuguese is still limited. This paper therefore proposes a methodology to deal with the peculiarities of Brazilian Portuguese in informal, short, and noisy texts, where the lack of context poses obstacles to text mining. The proposed methodology was evaluated on two tasks (Text Categorization and Opinion Mining), and the experiments showed that the preprocessing mechanisms included in the methodology were important for achieving better results.
Keywords: Text Mining, Online Social Networks, Natural Language Processing
Published
2016-11-08
STIILPEN JUNIOR, Milton; MERSCHMANN, Luiz Henrique de Campos. A Methodology to Handle Social Media Posts in Brazilian Portuguese for Text Mining Applications. In: BRAZILIAN SYMPOSIUM ON MULTIMEDIA AND THE WEB (WEBMEDIA), 22., 2016, Teresina. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2016. p. 239-246.