Analysis of the Evolution of Presidential Pre-candidates' Speeches through Vector Linguistic Representations
Abstract
Pre-candidates for government office commonly express their opinions and campaign platforms in informal speeches before the official campaign period. This behavior is essential for voters to learn the candidates' ideologies and platforms and thus make an informed voting decision. In this decision process, a voter may consider how similar the speeches of different candidates are, how each candidate's speech varies over time, and how well a speech addresses the themes most relevant to society. However, analyzing and capturing these aspects from informal speeches is a difficult task for the voter, given the volume of information released by numerous media outlets and the political bias of some of them. In this article, we therefore propose a political discourse analysis tool based on Linguistic Representation Learning techniques to support the voter's decision. Results obtained from the speeches of the pre-candidates for President of Brazil in 2018 show how the candidates behave with respect to their own speeches and to those of their competitors.
Keywords:
doc2vec, natural language processing, discourse analysis
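To make the approach concrete, the sketch below illustrates one way a doc2vec-based speech comparison can be set up, assuming gensim's Doc2Vec (PV-DM) as the paragraph-vector model. The speeches, tags, and hyperparameters are illustrative placeholders, not the paper's actual corpus or configuration.

# A minimal sketch, assuming gensim's Doc2Vec as the paragraph-vector
# model; speeches, tags, and hyperparameters are placeholders.
from gensim.models.doc2vec import Doc2Vec, TaggedDocument
from gensim.utils import simple_preprocess

# Hypothetical speeches keyed by candidate and month.
speeches = {
    "candidate_a/2018-03": "discurso sobre economia, emprego e renda ...",
    "candidate_b/2018-03": "discurso sobre seguranca publica e saude ...",
    "candidate_a/2018-06": "discurso sobre reforma tributaria e emprego ...",
}

# Tokenize each speech and tag it so its vector can be retrieved later.
corpus = [TaggedDocument(simple_preprocess(text), [tag])
          for tag, text in speeches.items()]

model = Doc2Vec(vector_size=100, min_count=1, epochs=40, dm=1)
model.build_vocab(corpus)
model.train(corpus, total_examples=model.corpus_count, epochs=model.epochs)

# Cosine similarity between two candidates' speeches in the same month.
print(model.dv.similarity("candidate_a/2018-03", "candidate_b/2018-03"))

# Evolution of a single candidate's discourse over time.
print(model.dv.similarity("candidate_a/2018-03", "candidate_a/2018-06"))

Tagging each speech by candidate and date lets the same trained document vectors answer both questions from the abstract: cross-candidate similarity and the drift of one candidate's discourse over time.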
Published
22/10/2018
How to Cite
VALERIANO, Kid; PAES, Aline; DE OLIVEIRA, Daniel. Análise da Evolução dos Discursos de Pré-candidatos à Presidente por meio de Representações Linguísticas Vetoriais. In: SYMPOSIUM ON KNOWLEDGE DISCOVERY, MINING AND LEARNING (KDMILE), 6., 2018, São Paulo/SP. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2018. p. 81-88. ISSN 2763-8944. DOI: https://doi.org/10.5753/kdmile.2018.27388.