Named entity recognition in Portuguese-language legal documents using neural networks

  • Caio C. R. Mota UFRPE
  • André C. A. Nascimento UFRPE
  • Péricles B. C. Miranda UFRPE
  • Rafael Ferreira Mello UFRPE
  • Isabel W. S. Maldonado NESS Law
  • José L. M. Coelho Filho NESS Law

Abstract


Over the past few years, information technology has been transforming the legal world, automating processes and, consequently, reducing the time needed to create and analyse digital legal documents. One of the most studied problems in this area is named entity recognition (NER) in unstructured texts. Previous work has not addressed the detection of legal entities using the neural network models available in natural language processing libraries. In this article, the use of the spaCy and FLAIR libraries was analyzed in the context of NER in initial petitions. The models were trained with pre-defined architectures and evaluated on two corpora, one of them developed in the scope of this work. The experiments showed good results with both spaCy and FLAIR, with the best performance obtained by a BiLSTM-CRF with FLAIR embeddings.
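The BiLSTM-CRF architecture named in the abstract combines per-token tag scores from a recurrent encoder with a CRF layer that chooses the best tag sequence for the whole sentence. The sketch below is neither the authors' code nor the FLAIR implementation; it only illustrates the CRF decoding step (Viterbi search) in plain Python. The function name `viterbi_decode`, the `PESSOA` tag set, and all scores are hypothetical values chosen for illustration.

```python
from typing import Dict, List


def viterbi_decode(
    emissions: List[Dict[str, float]],
    transitions: Dict[str, Dict[str, float]],
    tags: List[str],
) -> List[str]:
    """Return the highest-scoring tag sequence for one sentence.

    emissions[t][tag]       : score of `tag` at token t (in a BiLSTM-CRF,
                              this would come from the BiLSTM encoder).
    transitions[prev][cur]  : score of moving from tag `prev` to tag `cur`.
    """
    if not emissions:
        return []

    # best[t][tag] = best score of any tag path ending in `tag` at token t
    best = [{tag: emissions[0][tag] for tag in tags}]
    back = []  # back[t-1][tag] = previous tag on that best path

    for t in range(1, len(emissions)):
        scores, pointers = {}, {}
        for cur in tags:
            prev = max(tags, key=lambda p: best[-1][p] + transitions[p][cur])
            scores[cur] = best[-1][prev] + transitions[prev][cur] + emissions[t][cur]
            pointers[cur] = prev
        best.append(scores)
        back.append(pointers)

    # Start from the best final tag and follow the back-pointers.
    path = [max(tags, key=lambda tag: best[-1][tag])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


# Hypothetical BIO tag set and scores for the tokens "Maria", "Silva", "requer".
tags = ["O", "B-PESSOA", "I-PESSOA"]
emissions = [
    {"O": 1.0, "B-PESSOA": 0.8, "I-PESSOA": 0.2},
    {"O": 0.1, "B-PESSOA": 0.2, "I-PESSOA": 2.0},
    {"O": 2.0, "B-PESSOA": 0.1, "I-PESSOA": 0.1},
]
# Assumed transition scores: "O" followed by "I-PESSOA" is heavily penalised,
# since an I- tag should not open an entity in the BIO scheme.
transitions = {
    "O": {"O": 0.0, "B-PESSOA": 0.0, "I-PESSOA": -10.0},
    "B-PESSOA": {"O": 0.0, "B-PESSOA": 0.0, "I-PESSOA": 0.5},
    "I-PESSOA": {"O": 0.0, "B-PESSOA": 0.0, "I-PESSOA": 0.0},
}

decoded = viterbi_decode(emissions, transitions, tags)
print(decoded)
```

With these toy scores, choosing the top tag for each token independently would yield `O, I-PESSOA, O`, which is not a valid BIO sequence. Decoding the sequence jointly lets the transition penalty steer the first token to `B-PESSOA`, which is the kind of label consistency a CRF layer is meant to provide.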

Published
2021-11-29
MOTA, Caio C. R.; NASCIMENTO, André C. A.; MIRANDA, Péricles B. C.; MELLO, Rafael Ferreira; MALDONADO, Isabel W. S.; COELHO FILHO, José L. M. Reconhecimento de entidades nomeadas em documentos jurídicos em português utilizando redes neurais. In: NATIONAL MEETING ON ARTIFICIAL AND COMPUTATIONAL INTELLIGENCE (ENIAC), 18., 2021, Evento Online. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2021. p. 130-140. ISSN 2763-9061. DOI: https://doi.org/10.5753/eniac.2021.18247.
