Thiago Lira et al. 2024. Aroeira: A Curated Corpus for the Portuguese Language with a Large Number of Tokens. In Anais da XXXIV Brazilian Conference on Intelligent Systems, November 17, 2024, Belém/PA, Brazil. SBC, Porto Alegre, Brazil, 185-199.