A Sentiment Classification Approach for Book Reviews in Brazilian Portuguese Using Different Feature Extraction Methods

  • Larissa Britto Universidade Federal Rural de Pernambuco
  • Luciano Pacífico Universidade Federal Rural de Pernambuco

Abstract

The enormous volume of textual data published on the internet every day has driven research in several areas that automatically process and analyze such text. One of the most popular of these areas is Sentiment Analysis, which, despite having been widely discussed in recent years, still faces a shortage of available resources for Brazilian Portuguese. This work presents the complete sentiment analysis and classification process, from building a Portuguese-language dataset (in the book domain) to classifying sentiment using some of the main classifiers from the literature and different feature extraction methods.

Keywords: Sentiment Analysis, Natural Language Processing, Machine Learning, Text Feature Extraction
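
The abstract does not name the specific feature extraction methods or classifiers used, so the following is only a minimal sketch of the general idea: the same classifier is trained on two different feature representations of the same reviews. Everything here is an assumption made for illustration, including the two extractors (term counts and binary presence), the multinomial Naive Bayes classifier with add-one smoothing, and the toy reviews, which are invented and are not taken from the authors' dataset.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Lowercase and split on whitespace; a real pipeline would also
    # handle punctuation, stopwords, and similar preprocessing.
    return text.lower().split()


def count_features(tokens):
    # Bag of words with term frequencies.
    return Counter(tokens)


def binary_features(tokens):
    # Bag of words recording only presence (each word counts once).
    return Counter(set(tokens))


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self, extract):
        self.extract = extract  # feature extraction function to use

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(self.extract(tokenize(text)))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        feats = self.extract(tokenize(text))
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # Log prior plus smoothed log likelihood of each known word.
            score = math.log(n_docs / total_docs)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for word, value in feats.items():
                if word in self.vocab:
                    prob = (self.word_counts[label][word] + 1) / denom
                    score += value * math.log(prob)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy reviews, invented for this sketch.
TRAIN = [
    ("livro excelente adorei a história", "pos"),
    ("leitura maravilhosa e personagens ótimos", "pos"),
    ("adorei os personagens", "pos"),
    ("livro péssimo história chata", "neg"),
    ("leitura cansativa e final ruim", "neg"),
    ("odiei o final", "neg"),
]
texts, labels = zip(*TRAIN)

count_model = NaiveBayes(count_features).fit(texts, labels)
binary_model = NaiveBayes(binary_features).fit(texts, labels)

for name, model in [("counts", count_model), ("binary", binary_model)]:
    print(name, model.predict("adorei o livro excelente"))
```

Passing the extractor as a parameter keeps the classifier fixed while the representation changes, which is the kind of comparison the abstract describes; on realistic data the two representations would generally not behave identically.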

Published
20/10/2020
BRITTO, Larissa; PACÍFICO, Luciano. Uma Abordagem de Classificação de Sentimentos em Revisões de Livros em Português Brasileiro Usando Diferentes Métodos de Extração de Características. In: ENCONTRO NACIONAL DE INTELIGÊNCIA ARTIFICIAL E COMPUTACIONAL (ENIAC), 17., 2020, Online Event. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2020. p. 116-127. ISSN 2763-9061. DOI: https://doi.org/10.5753/eniac.2020.12122.
