Legal text summarization techniques to support the classification of court decision documents

Abstract


Judgments are text documents that record the judicial decisions made in a given legal proceeding. In a Court of Justice, judgments follow a well-defined thematic classification, which helps jurists organize their work and speeds up their daily routine. Because of the high daily volume of new judgments, techniques that automate the thematic classification of each new judgment are needed. Supervised machine learning techniques based on classification algorithms have not performed well on long Portuguese texts written in legal language. This work proposes using judgment summaries for thematic classification. The hypothesis is that shorter, summarized texts can make the classification of such documents by theme more effective. This is a work in progress: a classification approach based on summaries will be developed. Partial results indicate that summarization algorithms perform well on judgments.
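The summarize-then-classify idea can be illustrated with a minimal sketch. The abstract does not name the summarization or classification algorithms used, so everything here is an assumption made for illustration: a word-frequency extractive summarizer stands in for the summarization step, a token-overlap scorer stands in for the classifier, and the function names (`split_sentences`, `tokenize`, `summarize`, `classify`) are hypothetical.

```python
import re
from collections import Counter


def split_sentences(text):
    # Naive splitter on ., ! or ? followed by whitespace.
    # Assumption: real judgments would need a splitter aware of legal abbreviations.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(text):
    # Lowercased word tokens; \w is Unicode-aware, so accented Portuguese words survive.
    return re.findall(r"\w+", text.lower())


def summarize(text, max_sentences=2):
    """Extractive summary: keep the sentences with the highest average word frequency."""
    sentences = split_sentences(text)
    freq = Counter(tokenize(text))  # no stop-word removal in this sketch

    def score(sentence):
        tokens = tokenize(sentence)
        return sum(freq[t] for t in tokens) / len(tokens) if tokens else 0.0

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:max_sentences])  # restore original sentence order
    return " ".join(sentences[i] for i in keep)


def classify(summary, theme_examples):
    """Pick the theme whose example texts share the most tokens with the summary.

    theme_examples maps a theme label to a list of example texts (hypothetical data).
    """
    tokens = Counter(tokenize(summary))

    def overlap(theme):
        vocab = Counter(t for example in theme_examples[theme] for t in tokenize(example))
        return sum(min(count, vocab[t]) for t, count in tokens.items())

    return max(theme_examples, key=overlap)
```

In this sketch the classifier only ever sees the output of `summarize`, which is the point of the hypothesis: a shorter input for the classification step. A real pipeline would replace both stand-ins with the summarizers and supervised classifiers under study.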

Keywords: Summarization, Legal texts, Judicial classification, Extractive summaries

Published
2023-09-25
HARADA, Hellen; PEREIRA, Fabíola; ALMEIDA, Alex; FREIRE, Daniela; DIAS, Márcio; SILVA, Nádia; ANDRADE, Pedro; CARVALHO, André. Legal text summarization techniques to support the classification of court decision documents. In: BRAZILIAN SYMPOSIUM IN INFORMATION AND HUMAN LANGUAGE TECHNOLOGY (STIL), 14., 2023, Belo Horizonte/MG. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2023. p. 393-397. DOI: https://doi.org/10.5753/stil.2023.234627.