Investigating Sentence Features for Subjectivity and Polarity Classification in Brazilian Portuguese

Miguel Oliveira; Tiago Melo

doi:10.5753/eniac.2020.12135

Miguel Oliveira, Universidade do Estado do Amazonas
Tiago Melo, Universidade do Estado do Amazonas

Abstract

Identifying subjective sentences and determining their polarity are two important tasks in sentiment analysis. Although neither problem is new, most existing solutions target English. In this paper, we propose and evaluate a machine-learning approach that addresses both tasks in Portuguese. We investigate two classification models and propose a set of linguistic features derived from the text itself. We evaluate the methods against a representative set of baselines on a diverse collection of datasets. Our approach achieved the best results on both tasks and on all test datasets.

Keywords: sentiment analysis, natural language processing, textual features
