No Argument Left Behind: Overlapping Chunks for Faster Processing of Arbitrarily Long Legal Texts
Abstract
The Brazilian judiciary, the largest in the world, faces a crisis driven by the slow processing of millions of cases, making efficient methods for analyzing legal texts imperative. We introduce uBERT, a hybrid model that combines Transformer and Recurrent Neural Network architectures to handle long legal texts effectively. Our approach processes the full text regardless of its length while keeping the computational overhead reasonable. Our experiments show that uBERT outperforms BERT+LSTM when overlapping input is used and is significantly faster than ULMFiT for processing long legal documents.
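
As a rough illustration of the overlapping-chunk strategy summarized above, the sketch below splits a long token sequence into fixed-size windows with overlap (stride smaller than the window size), encodes each window with a BERT encoder, and aggregates the per-window [CLS] vectors with a recurrent layer to produce a single document representation. This is a minimal sketch under stated assumptions, not the authors' released implementation: the identifiers (OverlappingChunkClassifier, window_size, stride), the GRU aggregator, and the encoder checkpoint are illustrative choices.

# Minimal sketch of overlapping chunking plus BERT+RNN aggregation.
# All names and hyperparameters below are illustrative assumptions.
import torch
import torch.nn as nn
from transformers import AutoModel


def overlapping_chunks(token_ids, window_size=510, stride=255):
    """Split a list of token ids into overlapping windows (stride < window_size)."""
    chunks = []
    start = 0
    while start < len(token_ids):
        chunks.append(token_ids[start:start + window_size])
        if start + window_size >= len(token_ids):
            break
        start += stride
    return chunks


class OverlappingChunkClassifier(nn.Module):
    """BERT encodes each overlapping chunk; a GRU aggregates the chunk embeddings."""

    def __init__(self, encoder_name="bert-base-multilingual-cased", num_labels=2):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(encoder_name)  # illustrative checkpoint
        hidden = self.encoder.config.hidden_size
        self.rnn = nn.GRU(hidden, hidden, batch_first=True)
        self.classifier = nn.Linear(hidden, num_labels)

    def forward(self, chunk_input_ids, chunk_attention_mask):
        # chunk_input_ids: (num_chunks, seq_len) for a single document
        out = self.encoder(input_ids=chunk_input_ids,
                           attention_mask=chunk_attention_mask)
        cls_per_chunk = out.last_hidden_state[:, 0, :]           # (num_chunks, hidden)
        _, final_state = self.rnn(cls_per_chunk.unsqueeze(0))    # read chunks in document order
        return self.classifier(final_state.squeeze(0))           # (1, num_labels)

Because consecutive windows share roughly half their tokens under these settings, no sentence or argument is cut off at a hard chunk boundary, while the per-window sequence length stays within BERT's 512-token limit.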