Application of Topic Models to the Automated Analysis of Brazilian Senators' Speeches

  • Victor Landim Teixeirense Pinheiro UnB
  • Thiago de Paulo Faleiros UnB

Abstract


In this work, we intend to apply topic modeling techniques and evaluate the results in order to obtain information about senators' speeches and their thematic structure over time. Our society faces an overflow of information. In this context, automated analysis can bring out patterns in large collections of data. One such approach is Topic Modeling, which typically outputs topics from a collection of documents. A topic is a set of words that describes a clear semantic concept. Accordingly, we aim to extract patterns from the topics generated from the large collection of Brazilian senators' speeches provided by the Federal Senate. The main hypothesis is that the evolution of topics over time correlates with historical, political, and economic events. The results are matched against draft bills, relevant dates, and news articles. This work can thus contribute to transparency for Brazilian citizens regarding the patterns found in their politicians' speeches. Ultimately, it can be extended by evaluating more modern implementations of Topic Models.

Published
2022-11-28
PINHEIRO, Victor Landim Teixeirense; FALEIROS, Thiago de Paulo. Aplicação de Modelos de Tópicos em análises automatizadas de discursos de senadores brasileiros. In: NATIONAL MEETING ON ARTIFICIAL AND COMPUTATIONAL INTELLIGENCE (ENIAC), 19., 2022, Campinas/SP. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2022. p. 612-623. ISSN 2763-9061. DOI: https://doi.org/10.5753/eniac.2022.227634.