Machine Learning Algorithms Applied on Classification of Processes for Conciliation on Brazilian Labour Judiciary

  • Filipe M. C. Barros Universidade Federal do Pará
  • Cleison D. Silva Universidade Federal do Pará
  • Igor R. M. Silva Universidade Federal do Rio Grande do Norte
  • Victor S. Martins Universidade Federal do Pará
  • Antonio J. S. Araújo Universidade Federal do Pará

Abstract


The Labour Judiciary ensures protection and justice in labour relations by resolving conflicts such as unfair dismissals and delayed wages. Artificial intelligence can expedite legal activities and help the Judiciary cope with a caseload that has grown in recent years. In labour dispute resolution, conciliation is a recommended solution because it offers speed and cost reduction. This study therefore evaluates models that predict whether a labour case will be resolved through conciliation. The dataset used to build the models consists of initial petitions from cases extracted from the Processo Judicial Eletrônico (PJe) system maintained by the Tribunal Regional do Trabalho da 8ª Região. Pre-processing of these documents included removing accents, special characters, numerals, punctuation, and stopwords, converting the text to lowercase, stemming, and tokenization. The text was then vectorized using Term Frequency-Inverse Document Frequency (TF-IDF) for model generation. Three machine learning algorithms were considered: Support Vector Machines (SVM), logistic regression, and decision trees. A boosted tree model (XGBoost) was also generated. Among these models, the SVM with an RBF kernel yielded the best results, achieving an accuracy of 83%, an F1-score of 82%, a Matthews Correlation Coefficient (MCC) of 0.66, and an Area Under the ROC Curve (AUC) of 0.83.
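The pre-processing and TF-IDF steps named in the abstract can be illustrated with a minimal, standard-library sketch. It is not the authors' implementation: the stopword list, the sample sentences, and the function names `preprocess` and `tfidf` are hypothetical, stemming is omitted for brevity, and the weighting shown (raw term count times `ln(N / df) + 1`) is only one common TF-IDF variant, assumed here for illustration.

```python
import math
import re
import unicodedata
from collections import Counter

# Hypothetical, tiny Portuguese stopword list (illustration only).
STOPWORDS = {"a", "o", "e", "de", "do", "da", "em", "para", "que"}


def preprocess(text):
    """Lowercase, strip accents, drop non-letters, tokenize, remove stopwords.

    Stemming, which the abstract also lists, is intentionally left out here.
    """
    text = text.lower()
    # Decompose accented characters, then drop the combining marks.
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    # Replace numerals, punctuation, and special characters with spaces.
    text = re.sub(r"[^a-z\s]", " ", text)
    # Whitespace tokenization followed by stopword removal.
    return [tok for tok in text.split() if tok not in STOPWORDS]


def tfidf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: weight = count(term, doc) * (ln(N / df(term)) + 1).
    """
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append(
            {t: c * (math.log(n_docs / df[t]) + 1.0) for t, c in counts.items()}
        )
    return vectors


# Made-up sentences standing in for initial petitions.
petitions = [
    "O reclamante não recebeu salário em 2022.",
    "Pedido de conciliação: salário atrasado!",
]
tokens = [preprocess(p) for p in petitions]
vectors = tfidf(tokens)
```

In this sketch, a term that appears in every document (here `salario`) receives the minimum weight per occurrence, while terms unique to one petition are weighted more heavily. The resulting sparse vectors are the kind of input that a classifier such as an SVM would be trained on.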

Keywords: Labour Justice, Conciliation, Term Frequency-Inverse Document Frequency, Support Vector Machines, Logistic Regression, Decision Tree

Published
25/09/2023
BARROS, Filipe M. C.; SILVA, Cleison D.; SILVA, Igor R. M.; MARTINS, Victor S.; ARAÚJO, Antonio J. S. Machine Learning Algorithms Applied on Classification of Processes for Conciliation on Brazilian Labour Judiciary. In: ENCONTRO NACIONAL DE INTELIGÊNCIA ARTIFICIAL E COMPUTACIONAL (ENIAC), 20., 2023, Belo Horizonte/MG. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2023. p. 389-402. ISSN 2763-9061. DOI: https://doi.org/10.5753/eniac.2023.234189.