A Multilingual Approach to Sentiment Analysis
Abstract
Sentiment analysis has become an essential tool in many contexts, including the analysis of user opinions about products and services, forecasting during political campaigns, and even tracking financial market trends. Despite the strong interest in this topic and the volume of research in the area, most methods were designed to work with English content. In this study, we aim to fill this gap by proposing an approach for applying selected state-of-the-art sentiment analysis methods to 9 different languages. To this end, we use previously labeled datasets in each language together with simple machine translation into English, and we develop a methodology to compare and validate the results. Our results demonstrate the potential of this approach to make sentiment analysis independent of the English language.
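The pipeline described in the abstract (translate labeled non-English text into English, apply an English sentiment method, compare predictions with the original labels) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the toy lexicon, the lookup-table translator `toy_translate`, the three example sentences, and the use of plain accuracy as the validation metric are all assumptions made for the example.

```python
from typing import Callable, List, Tuple

# Hypothetical toy lexicon standing in for an English sentiment method.
POSITIVE = {"good", "great", "love"}
NEGATIVE = {"bad", "terrible", "hate"}


def english_polarity(text: str) -> int:
    """Return 1 (positive), -1 (negative) or 0 (neutral) by counting lexicon hits."""
    tokens = text.lower().split()
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    return (score > 0) - (score < 0)


def evaluate(
    dataset: List[Tuple[str, int]],
    translate: Callable[[str], str],
    classify: Callable[[str], int],
) -> float:
    """Translate each labeled text, classify the translation, and return accuracy."""
    if not dataset:
        raise ValueError("dataset must not be empty")
    correct = sum(classify(translate(text)) == label for text, label in dataset)
    return correct / len(dataset)


# Hypothetical stand-in for a machine translation service: a fixed lookup table.
TOY_TRANSLATIONS = {
    "eu amo este produto": "i love this product",
    "serviço terrível": "terrible service",
    "chegou ontem": "it arrived yesterday",
}


def toy_translate(text: str) -> str:
    """Return the stored English translation, or the input unchanged if unknown."""
    return TOY_TRANSLATIONS.get(text, text)


# Hypothetical labeled Portuguese examples: (text, gold polarity).
labeled_pt = [
    ("eu amo este produto", 1),
    ("serviço terrível", -1),
    ("chegou ontem", 0),
]

accuracy = evaluate(labeled_pt, toy_translate, english_polarity)
print(f"accuracy on translated text: {accuracy:.2f}")
```

Because `translate` and `classify` are passed in as functions, the same `evaluate` call could be repeated for each language and each English method, which is the kind of comparison the abstract refers to.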