Question Answering Techniques for Portuguese Legal Documents: A Systematic Literature Review
Abstract
The exponential growth of Portuguese-language legal documents has renewed interest in Question Answering (QA) systems capable of returning concise, legally sound answers to natural-language queries. This study presents a systematic literature review, conducted according to PRISMA 2020 guidelines, that synthesises current evidence on QA techniques applied to Lusophone legal texts. Searches, without temporal restrictions, were executed in nine databases (ACM, Ei Compendex, ISI Web of Science, Periódico Capes, Scielo, Science@Direct, Scopus, Sol SBC and Springer Link) using a search string that combines jurisprudential, linguistic and methodological terms. After duplicate removal, independent screening and quality appraisal, ten primary studies met the inclusion criteria (peer-reviewed publications developing or evaluating QA pipelines over Brazilian or Portuguese legislation). Publication activity is recent: more than 70% of the papers appeared between 2023 and 2025 and focus on Brazilian statutes and court decisions. Most pipelines adopt hybrid retrieval (BM25 or symbolic regex filters coupled with BERT-family dense encoders fine-tuned on legal corpora), while Retrieval-Augmented Generation with GPT-class models emerges in the latest research. Reported exact-match scores range from 0.60 to 0.83 and F1 from 0.75 to 0.87; however, only a quarter of the studies release code or data, hindering reproducibility. Common gaps include limited handling of the temporal validity of norms, scarce evaluation by legal specialists, and the absence of benchmark datasets for Portuguese. Overall, QA research for Lusophone law is accelerating yet remains fragmented; future work should prioritize shared resources, temporally aware models, and metrics that capture legal soundness beyond lexical overlap.
References
Athaydes, A., Bulcao, L., Sacramento, C., Mane, B., Claro, D., Souza, M., and Pita, R. (2024). Brazilian Consumer Protection Code: a methodology for a dataset to question-answer (QA) models. In Anais do XV Simpósio Brasileiro de Tecnologia da Informação e da Linguagem Humana, pages 493–500, Porto Alegre, RS, Brasil. SBC.
Barcellos, R., Bernardini, F., and Viterbo, J. (2020). A methodology for retrieving datasets from open government data portals using information retrieval and question and answering techniques. In Viale Pereira, G., Janssen, M., Lee, H., Lindgren, I., Rodríguez Bolívar, M. P., Scholl, H. J., and Zuiderwijk, A., editors, Electronic Government, pages 239–249, Cham. Springer International Publishing.
Barros, T. S., Pires, C. E. S., and Nascimento, D. C. (2023). Leveraging BERT for extractive text summarization on federal police documents. Knowledge and Information Systems, 65(11):4873–4903.
Bertalan, V. G. F. and Ruiz, E. E. S. (2024). Using attention methods to predict judicial outcomes. Artificial Intelligence and Law, 32(1):87–115.
Costa, Y. D. R., Oliveira, H., Nogueira, V., Massa, L., Yang, X., Barbosa, A., Oliveira, K., and Vieira, T. (2025). Automating petition classification in Brazil’s legal system: a two-step deep learning approach. Artificial Intelligence and Law, 33(1):227–251.
de Vargas Feijó, D. and Moreira, V. P. (2018). RulingBR: A summarization dataset for legal texts. In Villavicencio, A., Moreira, V., Abad, A., Caseli, H., Gamallo, P., Ramisch, C., Gonçalo Oliveira, H., and Paetzold, G. H., editors, Computational Processing of the Portuguese Language, pages 255–264, Cham. Springer International Publishing.
Ferneda, E., do Prado, H. A., Batista, A. H., and Pinheiro, M. S. (2012). Extracting definitions from Brazilian legal texts. In Murgante, B., Gervasi, O., Misra, S., Nedjah, N., Rocha, A. M. A. C., Taniar, D., and Apduhan, B. O., editors, Computational Science and Its Applications – ICCSA 2012, pages 631–646, Berlin, Heidelberg. Springer Berlin Heidelberg.
Filho, A. D. A. (2019). Do casamento às uniões sem selo: O alcance social e jurídico dos arranjos familiares no Brasil e em Portugal. Revista Jurídica Portucalense.
Jerónimo, P. (2025). Legal translation and the challenges of overcoming language barriers in court practice: Evidence from Portuguese courts. International Journal for the Semiotics of Law. Advance online publication.
Martinez-Gil, J. (2023). A survey on legal question–answering systems. Computer Science Review, 48:100552.
Nunes, R. O., Santos, J., Spritzer, A., Balreira, D. G., Freitas, C. M. D. S., Olival, F., Cameron, H. F., and Vieira, R. (2025). Assessing European and Brazilian Portuguese LLMs for NER in specialised domains. In Paes, A. and Verri, F. A. N., editors, Intelligent Systems, pages 215–230, Cham. Springer Nature Switzerland.
Page, M. J., McKenzie, J. E., Bossuyt, P. M., Boutron, I., Hoffmann, T. C., Mulrow, C. D., Shamseer, L., Tetzlaff, J. M., Akl, E. A., Brennan, S. E., Chou, R., Glanville, J., Grimshaw, J. M., Hróbjartsson, A., Lalu, M. M., Li, T., Loder, E. W., Mayo-Wilson, E., McDonald, S., and Moher, D. (2021). The PRISMA 2020 statement: An updated guideline for reporting systematic reviews. PLOS Medicine, 18(3).
Perlingeiro, R. and Ghio, E. (2020). Princípios Gerais da Cooperação Jurídica Internacional: uma abordagem temática e comparativa. Núcleo de Pesquisa e Extensão sobre Ciências do Poder Judiciário (Nupej), Niterói, Brasil, 1 edition.
Ramos, M. (2017). Governo das sociedades e responsabilidade civil dos administradores: Algumas reflexões a partir da experiência jurídica portuguesa. Social Science Research Network.
Ransolin, M. and Baruffi, P. (2022). O direito ao esquecimento: A exclusão de notícias que ferem a integridade e intimidade da pessoa e a atual discussão do Supremo Tribunal Federal. Ponto de Vista Jurídico.
Sakiyama, K., Montanari, R., Malaquias Junior, R., Nogueira, R., and Romero, R. A. F. (2023). Exploring text decoding methods for Portuguese legal text generation. In Naldi, M. C. and Bianchi, R. A. C., editors, Intelligent Systems, pages 63–77, Cham. Springer Nature Switzerland.
Viegas, C. F. O., Costa, B. C., and Ishii, R. P. (2023). JurisBERT: A new approach that converts a classification corpus into an STS one. In Gervasi, O., Murgante, B., Taniar, D., Apduhan, B. O., Braga, A. C., Garau, C., and Stratigea, A., editors, Computational Science and Its Applications – ICCSA 2023, pages 349–365, Cham. Springer Nature Switzerland.
Published
2025-09-29
How to Cite
LIMA, Maurício Rodrigues; TELES, Vinícius; TELES, Sávio; DIAS, Elisângela Silva. Question Answering Techniques for Portuguese Legal Documents: A Systematic Literature Review. In: NATIONAL MEETING ON ARTIFICIAL AND COMPUTATIONAL INTELLIGENCE (ENIAC), 22., 2025, Fortaleza/CE. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 783-794. ISSN 2763-9061. DOI: https://doi.org/10.5753/eniac.2025.14205.
