Legal text summarization techniques to support the classification of court decision documents
Abstract
Judgments are text documents that record judicial decisions in a given legal proceeding. In a Court of Justice, judgments follow a well-defined classification by theme, which helps jurists organize and speed up their daily work. Because of the high daily volume of new judgments, techniques are needed to automate the thematic classification of each new judgment. Supervised machine learning techniques based on classification algorithms have not performed well on long texts written in Portuguese legal language. This work proposes using judgment summaries for thematic classification. The hypothesis is that shorter, summarized texts can make the classification of such documents by theme more effective. This is a work in progress: a classification approach based on summaries will be developed. Partial results indicate that summarization algorithms perform well on judgments.
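The summarize-then-classify idea can be illustrated with a minimal sketch. It is not the approach under development in this work: the frequency-based extractive summarizer and the nearest-centroid classifier are simple stand-ins chosen for brevity, and every name in it (`summarize`, `train_centroids`, `classify`, the toy themes and texts) is hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercased word tokens; \w also matches accented Portuguese letters.
    return re.findall(r"\w+", text.lower())


def summarize(text, max_sentences=2):
    """Keep the sentences with the highest average word frequency, in original order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(tokenize(text))

    def score(sentence):
        words = tokenize(sentence)
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    return " ".join(sentences[i] for i in sorted(ranked[:max_sentences]))


def cosine(a, b):
    # Cosine similarity between two bag-of-words Counters.
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def train_centroids(labeled_docs):
    """Build one summed bag-of-words vector per theme from the summaries only."""
    centroids = {}
    for text, theme in labeled_docs:
        centroids.setdefault(theme, Counter()).update(tokenize(summarize(text)))
    return centroids


def classify(text, centroids):
    # Summarize the new judgment, then pick the most similar theme vector.
    vec = Counter(tokenize(summarize(text)))
    return max(centroids, key=lambda theme: cosine(vec, centroids[theme]))


# Toy, made-up examples (assumption: real judgments are far longer and in Portuguese).
train = [
    ("The tenant failed to pay rent. The lease was terminated. The court ordered eviction.", "housing"),
    ("The employee claimed unpaid wages. The employer denied overtime. The court awarded wages.", "labor"),
]
centroids = train_centroids(train)
predicted = classify("The landlord sued over unpaid rent. The lease allowed eviction.", centroids)
```

The classifier only ever sees the summaries, which is the point of the hypothesis. The sentence splitter here is naive and would break on legal abbreviations that end in a period, so a real pipeline would need a proper sentence segmenter.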
