Automatic identification of similar judicial precedents

  • Igor Stemler, Universidade de Brasília
  • Marcelo Ladeira, Universidade de Brasília
  • Thiago de Paulo Faleiros, Universidade de Brasília

Abstract


The Brazilian Code of Civil Procedure was reformulated in 2015 and created new judicial precedent institutes that allow the Courts of Appeal to decide similar cases based on one main case, which serves as the paradigm for the similar cases that remain suspended. This mechanism aims to avoid legal uncertainty in the lower courts, but the uncertainty can shift to the Courts of Appeal, since different courts can judge similar legal matters in opposite ways. Identifying similar judicial cases is hard because the Courts of Appeal work independently and the number of cases is high. We propose the use of computational intelligence techniques to automatically identify similar judicial precedents. Our hypothesis is that algorithms based on semantic approaches, such as Latent Semantic Indexing (LSI) and Latent Dirichlet Allocation (LDA), perform better than those that use only a syntactic approach, such as the Okapi BM25 ranking function. The best-performing model is extended with named entities to verify whether its performance increases. The performance of the models is evaluated using similarity metrics and with the assistance of a specialist. We test this approach on the database of judicial precedents of the National Council of Justice. Our approach correctly grouped more than 90% of judicial precedents and found similar precedents with divergent decisions, as well as precedents that should be suspended because appeals on the same subject matter exist in superior courts. Models based on the syntactic approach presented the best results, as they required lower computational cost and less parameter tuning than the others.
Keywords: Judicial Precedents, Class Action, BM25, LSI, LDA
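
To make the syntactic baseline named in the abstract concrete, the following is a minimal sketch of Okapi BM25 scoring in plain Python. It is not the authors' implementation: the function name `bm25_scores`, the parameter values `k1=1.5` and `b=0.75`, the smoothed IDF variant (with `+ 1` inside the logarithm, which keeps scores non-negative), and the three toy documents are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized document in `docs` against the tokenized `query`.

    Hypothetical helper: k1 and b are commonly used defaults,
    not values taken from the paper.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        # Length normalization: longer-than-average documents are penalized.
        length_norm = k1 * (1 - b + b * len(d) / avg_len)
        score = 0.0
        for term in set(query):
            if term not in tf:
                continue
            # Smoothed IDF variant (assumption), always non-negative.
            idf = math.log((n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
        scores.append(score)
    return scores


# Toy, invented summaries standing in for precedent texts.
precedents = [
    "bank fee charged on consumer credit contract".split(),
    "consumer credit contract abusive bank fee refund".split(),
    "public servant salary adjustment retroactive payment".split(),
]

# Use the first precedent as the query and rank all precedents against it.
scores = bm25_scores(precedents[0], precedents)
ranking = sorted(range(len(precedents)), key=lambda i: scores[i], reverse=True)
```

In this toy setup the third document shares no terms with the query, so its score is zero, while the two consumer-credit documents rank above it. Because BM25 matches only literal terms, it needs no topic count or latent dimension to tune, in contrast to LSI and LDA.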

Published
28/11/2022
STEMLER, Igor; LADEIRA, Marcelo; FALEIROS, Thiago de Paulo. Automatic identification of similar judicial precedents. In: SYMPOSIUM ON KNOWLEDGE DISCOVERY, MINING AND LEARNING (KDMILE), 10., 2022, Campinas/SP. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2022. p. 25-33. ISSN 2763-8944. DOI: https://doi.org/10.5753/kdmile.2022.227943.