A Graph-Based Method for Predicting the Helpfulness of Product Opinions





Natural Language Processing, Helpfulness Prediction, Opinion Mining


This paper presents a new approach to predicting the helpfulness of opinions. Researchers in this area usually use attribute-value tables to aggregate the features that represent the evaluated texts. Although this representation is common, it treats the objects as independent. We argue that some of the factors that discriminate helpful from unhelpful opinions depend on the relationships among the elements that form an opinion. We therefore model the task as a network that captures the relations among its objects (comments, stars, and words). A graph regularization technique is used to extract the relevant features from the graph structure, after which the comments are classified as helpful or unhelpful. We compared our network model with two baseline methods, one based on fuzzy logic and another based on neural networks. Our model outperformed the fuzzy logic and neural network methods by 0.17 and 0.19 in F-measure, respectively. The main advantages of our approach are that little data is needed for helpfulness classification and that the relationships may help in understanding the classification, explaining the reasons for a given classification.
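To make the idea of classifying comments through a network of comments, stars, and words more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a simple closed-form label-spreading regularizer in the style of local and global consistency, unweighted edges, whitespace tokenization, and toy data. The function names `build_network`, `spread_labels`, and `classify`, as well as the parameter `alpha`, are hypothetical and chosen only for illustration.

```python
import numpy as np


def build_network(comments, stars):
    """Build a symmetric adjacency matrix over comment, word and star nodes.

    Node order: all comments, then vocabulary words, then distinct star values.
    Edges are unweighted (an assumption of this sketch).
    """
    vocab = sorted({w for text in comments for w in text.split()})
    star_values = sorted(set(stars))
    n_c, n_w = len(comments), len(vocab)
    n = n_c + n_w + len(star_values)
    W = np.zeros((n, n))
    for i, (text, star) in enumerate(zip(comments, stars)):
        # Link the comment to every distinct word it contains.
        for w in set(text.split()):
            j = n_c + vocab.index(w)
            W[i, j] = W[j, i] = 1.0
        # Link the comment to its star rating.
        k = n_c + n_w + star_values.index(star)
        W[i, k] = W[k, i] = 1.0
    return W, n_c


def spread_labels(W, Y, alpha=0.9):
    """Closed-form label spreading: F = (I - alpha * S)^-1 Y,
    with S = D^-1/2 W D^-1/2 the symmetrically normalized adjacency."""
    deg = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.where(deg > 0, deg, 1.0))
    S = W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.linalg.solve(np.eye(len(W)) - alpha * S, Y)


def classify(comments, stars, labels, alpha=0.9):
    """labels[i] is 1 (helpful), 0 (unhelpful) or None (unlabeled).
    Returns one predicted class per comment."""
    W, n_c = build_network(comments, stars)
    Y = np.zeros((len(W), 2))
    for i, lab in enumerate(labels):
        if lab is not None:
            Y[i, lab] = 1.0
    F = spread_labels(W, Y, alpha)
    return F[:n_c].argmax(axis=1)


# Toy data, invented for illustration only.
comments = [
    "battery lasts long screen sharp",  # labeled helpful
    "bad",                              # labeled unhelpful
    "screen sharp battery fine",        # unlabeled
    "bad bad",                          # unlabeled
]
stars = [5, 1, 5, 1]
labels = [1, 0, None, None]

print(classify(comments, stars, labels))
```

Only two of the four comments carry a label here. The unlabeled ones receive their class through the word and star nodes they share with labeled comments, which is the sense in which little labeled data is needed and the connecting relations can be inspected to explain a decision.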








How to Cite

de Sousa, R. F., Anchiêta, R. T., & Nunes, M. das G. V. (2020). A Graph-Based Method for Predicting the Helpfulness of Product Opinions. ISys - Brazilian Journal of Information Systems, 13(4), 06–21. https://doi.org/10.5753/isys.2020.821



Extended versions of selected articles