A Majority Voting Approach for Sentiment Analysis in Short Texts using Topic Models

  • Rodrigo Rodrigues do Carmo CEFET-MG
  • Anísio Mendes Lacerda CEFET-MG
  • Daniel Hasan Dalip CEFET-MG

Abstract


People now routinely provide feedback on products and services on the web, and site owners can use this information to better understand the preferences of their audience. Sentiment analysis helps with this task by providing methods that infer the polarity of reviews. In these methods, the classifier relies on hints about the polarity of the words and about the subject under discussion to infer the polarity of the text. However, many of these texts are short, which makes those hints hard for the classifier to infer. We propose a new sentiment analysis method that uses topic models to infer the polarity of short texts. The intuition is that topics give the classifier a better grasp of the context and thereby improve its performance on this task. In this approach, we first represent each review using a topic inference method (LDA, BTM, or MedLDA) and then apply a classifier (e.g., Linear SVM, Random Forest, or Logistic Regression). We combine the results of classifiers and text representations by majority voting in two ways: (1) using a single topic representation with multiple classifiers, and (2) using multiple topic representations with a single classifier. We also analyze the impact of expanding these texts, since topic models can struggle with the data sparsity present in short reviews. The proposed approach achieved gains of up to 8.5% over our baseline. Moreover, we were able to determine the best classifier (Random Forest) and the best topic detection method (MedLDA).
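
The abstract does not spell out the voting rule, so the following is only a minimal sketch of how per-review majority voting over several classifiers could look, assuming each classifier has already produced one polarity label per review from the same topic representation. The function name `majority_vote`, the label strings, the example predictions, and the tie-breaking rule (the earliest model in the list wins) are all assumptions made for illustration, not details taken from the paper.

```python
from collections import Counter
from typing import Hashable, List, Sequence


def majority_vote(votes_per_model: Sequence[Sequence[Hashable]]) -> List[Hashable]:
    """Combine per-model label predictions into one label per text.

    votes_per_model[i][j] is the label that model i assigns to text j.
    """
    if not votes_per_model:
        return []
    n_texts = len(votes_per_model[0])
    if any(len(votes) != n_texts for votes in votes_per_model):
        raise ValueError("every model must predict a label for every text")

    combined = []
    for votes in zip(*votes_per_model):  # all votes cast for one text
        counts = Counter(votes)
        top = max(counts.values())
        # Tie-break (assumption): among the most frequent labels,
        # keep the one voted by the earliest model in the list.
        winner = next(label for label in votes if counts[label] == top)
        combined.append(winner)
    return combined


# Hypothetical predictions from three classifiers trained on the same
# topic representation (strategy 1 in the abstract).
svm_preds = ["pos", "neg", "pos", "neg"]
forest_preds = ["pos", "pos", "neg", "neg"]
logreg_preds = ["neg", "pos", "pos", "neg"]

ensemble_preds = majority_vote([svm_preds, forest_preds, logreg_preds])
print(ensemble_preds)
```

The same function would cover the second strategy as well: instead of three classifiers over one representation, the input lists would hold the predictions of a single classifier trained on each of the topic representations in turn.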
Published
17/10/2017
How to Cite

CARMO, Rodrigo Rodrigues do; LACERDA, Anísio Mendes; DALIP, Daniel Hasan. A Majority Voting Approach for Sentiment Analysis in Short Texts using Topic Models. In: SIMPÓSIO BRASILEIRO DE SISTEMAS MULTIMÍDIA E WEB (WEBMEDIA), 23., 2017, Gramado. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2017. p. 449-455.