SteamBR: a dataset for game reviews and evaluation of a state-of-the-art method for helpfulness prediction

  • Germano A. Z. Jorge USP
  • Thiago A. S. Pardo USP


The digital revolution has led to exponential growth in user-generated content, including ratings and reviews, across numerous online platforms. One such platform is Steam, a digital distribution network primarily for video games, which also functions as an active social network. Like users of many e-commerce, travel, and restaurant platforms, Steam users rely heavily on reviews to inform their purchasing decisions. However, the sheer volume of data and the varying quality of reviews may limit their usefulness. Furthermore, assessing the helpfulness of recent reviews, or of reviews that have received few votes, remains a significant challenge. This study proposes a method for automating the evaluation of review helpfulness, focusing on game reviews written in Brazilian Portuguese. We collected a large, novel dataset of game reviews comprising 2,789,893 reviews of over 12,000 games. Using feature extraction techniques, we captured the metadata, semantic elements, and distributional characteristics present in the reviews. Machine Learning algorithms were then applied to classification and regression tasks, with the objective of distinguishing helpful from unhelpful reviews. The results indicate that the method is highly effective in predicting review helpfulness.
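
As a rough illustration of the pipeline summarized above, the sketch below derives a helpfulness target from vote counts and extracts a few simple metadata features from each review. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not specify how helpfulness is defined or which features are used, so the `Review` fields, the vote-ratio score, the 0.5 threshold, and the feature names are all hypothetical. The score could serve as a regression target and the thresholded label as a classification target.

```python
from dataclasses import dataclass


@dataclass
class Review:
    # All fields are hypothetical; the actual SteamBR schema may differ.
    text: str
    votes_up: int          # votes marking the review as helpful
    votes_total: int       # all votes the review received
    playtime_hours: float  # example of a metadata attribute


def helpfulness_score(review):
    """Fraction of votes marking the review helpful; None if it has no votes."""
    if review.votes_total == 0:
        return None
    return review.votes_up / review.votes_total


def helpfulness_label(review, threshold=0.5):
    """Binary target: 1 if the score reaches the (assumed) threshold, else 0."""
    score = helpfulness_score(review)
    if score is None:
        return None  # unvoted reviews have no label; they are what a model would predict
    return int(score >= threshold)


def extract_features(review):
    """A few simple surface and metadata features (illustrative only)."""
    words = review.text.split()
    n_words = len(words)
    return {
        "n_chars": len(review.text),
        "n_words": n_words,
        "avg_word_len": sum(len(w) for w in words) / n_words if n_words else 0.0,
        "playtime_hours": review.playtime_hours,
    }


# Made-up example reviews, not taken from the dataset.
reviews = [
    Review("Jogo excelente, história envolvente e ótima trilha sonora.", 18, 20, 42.5),
    Review("ruim", 1, 10, 0.3),
    Review("Ainda não joguei muito.", 0, 0, 1.0),
]

rows = [(extract_features(r), helpfulness_score(r), helpfulness_label(r)) for r in reviews]
for features, score, label in rows:
    print(features, score, label)
```

In this sketch, the third review has no votes and therefore no label, which mirrors the case of recent or less-voted reviews that motivates automatic prediction.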


JORGE, Germano A. Z.; PARDO, Thiago A. S. SteamBR: a dataset for game reviews and evaluation of a state-of-the-art method for helpfulness prediction. In: BRAZILIAN WORKSHOP ON SOCIAL NETWORK ANALYSIS AND MINING (BRASNAM), 12., 2023, João Pessoa/PB. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2023. p. 210-215. ISSN 2595-6094. DOI:
