An Ensemble of LLMs Finetuned with LoRA for NER in Portuguese Legal Documents

  • Rafael Oleques Nunes UFRGS
  • Letícia Maria Puttlitz UFRGS
  • Antonio Oss Boll UFRGS
  • Andre Spritzer UFRGS
  • Carla Maria Dal Sasso Freitas UFRGS
  • Dennis Giovani Balreira UFRGS
  • Anderson Rocha Tavares UFRGS

Abstract


Given the high computational costs of traditional fine-tuning methods and the goal of improving performance, this study investigates the application of low-rank adaptation (LoRA) for fine-tuning BERT models for Portuguese Legal Named Entity Recognition (NER) and the integration of Large Language Models (LLMs) in an ensemble setup. Focusing on the underrepresented Portuguese language, we aim to examine the reliability of extractions enabled by LoRA models and to glean actionable insights from the results of both LoRA models and LLMs operating in ensembles. Achieving F1-scores of 88.49% on the LeNER-Br corpus and 81.00% on the UlyssesNER-Br corpus, LoRA models demonstrated competitive performance, approaching state-of-the-art results. Our research demonstrates that incorporating class definitions and counting votes per class substantially improves LLM ensemble results. Overall, this contribution advances AI-powered legal text mining, proposing small models and initial prompt engineering for low-resource conditions that can scale to broader representation.
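As a rough illustration of what "counting votes per class" could look like in an ensemble, the sketch below tallies how many models proposed each (span, class) pair and keeps the pairs that reach a threshold. This is an assumption about the general idea, not the authors' implementation: the function name `vote_per_class`, the `min_votes` parameter, the example spans, and the label names are all hypothetical.

```python
from collections import Counter

def vote_per_class(predictions, min_votes):
    """Hypothetical helper: predictions holds one collection of
    (span_text, label) pairs per ensemble member."""
    votes = Counter()
    for model_output in predictions:
        # set() ensures each model votes at most once per (span, label) pair.
        votes.update(set(model_output))
    # Keep an entity only if enough models agree on both span and class.
    return {pair for pair, n in votes.items() if n >= min_votes}

# Illustrative outputs from three ensemble members (made-up data).
outputs = [
    {("Lei 8.112", "LEGISLACAO"), ("Brasília", "LOCAL")},
    {("Lei 8.112", "LEGISLACAO")},
    {("Brasília", "ORGANIZACAO"), ("Lei 8.112", "LEGISLACAO")},
]
accepted = vote_per_class(outputs, min_votes=2)
```

Here the two conflicting labels for the same span each receive a single vote, so neither passes a threshold of two, while the pair all three models agree on is kept.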
Published
17/11/2024
NUNES, Rafael Oleques; PUTTLITZ, Letícia Maria; BOLL, Antonio Oss; SPRITZER, Andre; FREITAS, Carla Maria Dal Sasso; BALREIRA, Dennis Giovani; TAVARES, Anderson Rocha. An Ensemble of LLMs Finetuned with LoRA for NER in Portuguese Legal Documents. In: BRAZILIAN CONFERENCE ON INTELLIGENT SYSTEMS (BRACIS), 13., 2024, Belém/PA. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2024. p. 127-140. ISSN 2643-6264.