Do you know what your senator advocates for in the committees they participate in? An LLM-based approach to topic and stance detection in parliamentary discussions

Helen Bento Cavalcanti; Claudio E. C. Campelo

doi:10.5753/sbbd_estendido.2024.243776

Helen Bento Cavalcanti Universidade Federal de Campina Grande http://orcid.org/0009-0008-1929-7808
Claudio E. C. Campelo Universidade Federal de Campina Grande

Abstract

Brazil's legislative branch faces challenges in making its discussions accessible to the population, which is essential for strengthening democracy. Although stenographic notes of Senate and House of Representatives committee meetings are publicly available, their length and volume make it impractical for citizens to follow what actually happens in those meetings. A tool that automatically extracts and summarizes useful information from these discussions would therefore be transformative, empowering voters to monitor their representatives more effectively. This study investigates the efficacy of Large Language Models (LLMs) for detecting the relevant topics discussed and the stances of parliamentarians. We conducted experiments using GPT-3.5-Turbo to interpret 2023 stenographic notes from the Federal Senate. The results were promising, with average accuracies of 70% for topic detection and 60% for stance detection.

Keywords: Senator, LLM, Topic Modeling
