Lira, Thiago, Flávio Cação, Cinthia Souza, João Valentini, Edson Bollis, Otavio Oliveira, Renato Almeida, Marcio Magalhães, Katia Poloni, Andre Oliveira, and Lucas Pellicer. "Aroeira: A Curated Corpus for the Portuguese Language with a Large Number of Tokens." Anais da XXXIV Brazilian Conference on Intelligent Systems, Belém/PA, 2024. SBC, 2024, pp. 185-199.