Mining Portuguese Comparative Sentences in Online Reviews

  • Daniel Kansaon UFMG
  • Michele A. Brandão IFMG
  • Julio C. S. Reis UFMG
  • Matheus Barbosa UFMG
  • Breno Matos UFMG


The constant expansion of e-commerce, recently boosted by the coronavirus pandemic, has led to a huge increase in online shopping. Customers increasingly rely on online product reviews and comments on the Web when deciding to buy one product over another. In this context, sentiment analysis techniques are the traditional way to summarize user opinions that criticize a product or highlight its positive aspects. Sentiment analysis of reviews usually relies on extracting positive and negative aspects of products, neglecting comparative opinions. Such opinions do not directly express a positive or negative view but contrast aspects of products from different competitors. In this paper, we present the first effort towards detecting comparative sentences in Portuguese. Identifying comparative sentences is a key task for companies that want to know how users compare their products with those of competitors, and it is essential for developing sentiment summarization applications for end users. In addition, we present a supervised approach to automatically detect Portuguese comparative sentences, classifying them into five distinct groups: (1) Non-Comparative, (2) Non-Equal Gradable, (3) Equative, (4) Superlative, and (5) Non-Gradable. To that end, this paper provides three main contributions: (1) a Portuguese lexicon of words used to make comparisons; (2) two new Portuguese datasets with comparative sentences; and (3) a hierarchical approach for detecting multiple comparisons and classifying sentences into different groups using state-of-the-art classification algorithms, reaching an accuracy of 87%.
Keywords: opinion mining, comparative opinion, sentiment analysis
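
As a rough illustration of the hierarchical idea described in the abstract, the sketch below first checks whether a sentence contains any comparison cue word and only then assigns one of the four comparative groups. It is not the authors' method: the paper uses supervised classifiers, whereas this sketch substitutes hand-written regular expressions, and the cue words in `CUE_WORDS` and the patterns in `TYPE_PATTERNS` are hypothetical examples, not the lexicon released with the paper.

```python
import re

# Hypothetical cue words (an assumption, NOT the paper's lexicon).
CUE_WORDS = {"mais", "menos", "melhor", "pior", "maior", "menor",
             "tão", "igual", "diferente"}

# Hypothetical patterns per group, checked in order (first match wins).
TYPE_PATTERNS = [
    ("Equative", [r"\btão \w+ (quanto|como)\b", r"\bigual (a|ao|à)\b"]),
    ("Non-Equal Gradable", [r"\b(mais|menos) \w+ (do )?que\b",
                            r"\b(melhor|pior|maior|menor) (do )?que\b"]),
    ("Superlative", [r"\b[oa] (mais|menos) \w+\b",
                     r"\b[oa] (melhor|pior|maior|menor)\b"]),
    ("Non-Gradable", [r"\bdiferente (de|do|da)\b"]),
]


def has_cue(sentence: str) -> bool:
    """Stage 1: does the sentence contain any comparison cue word?"""
    tokens = re.findall(r"\w+", sentence.lower())
    return any(tok in CUE_WORDS for tok in tokens)


def classify(sentence: str) -> str:
    """Stage 2 runs only for sentences that pass stage 1."""
    if not has_cue(sentence):
        return "Non-Comparative"
    text = sentence.lower()
    for label, patterns in TYPE_PATTERNS:
        if any(re.search(p, text) for p in patterns):
            return label
    # A cue word alone is not enough evidence of a comparison.
    return "Non-Comparative"


if __name__ == "__main__":
    for s in ["Este celular é mais rápido que o outro",
              "A bateria é tão boa quanto a do concorrente",
              "Esta é a melhor câmera da loja",
              "O design é diferente do modelo antigo",
              "Chegou rápido e bem embalado"]:
        print(f"{classify(s):20s} {s}")
```

In a supervised setting, each stage would be replaced by a trained classifier: a binary model for comparative versus non-comparative sentences, followed by a multi-class model over the four comparative groups.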
