Named entity recognition in Portuguese legal documents using neural networks

Caio C. R. Mota; André C. A. Nascimento; Péricles B. C. Miranda; Rafael Ferreira Mello; Isabel W. S. Maldonado; José L. M. Coelho Filho

doi:10.5753/eniac.2021.18247

Caio C. R. Mota (UFRPE)
André C. A. Nascimento (UFRPE)
Péricles B. C. Miranda (UFRPE)
Rafael Ferreira Mello (UFRPE)
Isabel W. S. Maldonado (NESS Law)
José L. M. Coelho Filho (NESS Law)

Abstract

Over the past few years, information technology has been transforming the legal world, automating processes and, as a result, reducing the time needed to create and analyze digital legal documents. One of the most studied problems in this area is named entity recognition (NER) in unstructured text. Previous work has not addressed the detection of legal entities using the neural-network-based models available in natural language processing libraries. In this paper, the spaCy and FLAIR libraries are analyzed in the context of NER in initial petitions. The models were trained with predefined architectures and evaluated on two corpora, one of them developed as part of this work. The experiments showed good results with both spaCy and FLAIR, with the best performance obtained by the BiLSTM-CRF with FLAIR embeddings.
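To make the task concrete, the sketch below shows how token-level tagger output in the common BIO scheme can be turned into entity spans and scored at the entity level. It is an illustration under stated assumptions, not the procedure used in the paper: the abstract does not state the tagging scheme or the evaluation metric, and the label names (`PESSOA`, `ORGANIZACAO`), the example sentence, and the helpers `bio_to_spans` and `span_f1` are all hypothetical.

```python
from typing import List, Tuple

# (label, start token index, end token index, exclusive)
Span = Tuple[str, int, int]


def bio_to_spans(tags: List[str]) -> List[Span]:
    """Convert a BIO tag sequence into a list of entity spans."""
    spans: List[Span] = []
    label, start = None, 0
    # A trailing "O" sentinel closes an entity that runs to the end.
    for i, tag in enumerate(tags + ["O"]):
        if tag.startswith("I-") and label == tag[2:]:
            continue  # still inside the current entity
        if label is not None:
            spans.append((label, start, i))
            label = None
        if tag.startswith("B-") or tag.startswith("I-"):
            # Assumption: a stray I- tag with no matching B- opens a new entity.
            label, start = tag[2:], i
    return spans


def span_f1(gold: List[str], pred: List[str]) -> Tuple[float, float, float]:
    """Entity-level precision, recall and F1 with exact span matching."""
    g, p = set(bio_to_spans(gold)), set(bio_to_spans(pred))
    tp = len(g & p)
    precision = tp / len(p) if p else 0.0
    recall = tp / len(g) if g else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Invented example sentence and labels, for illustration only.
tokens = ["Maria", "Silva", "ajuizou", "ação", "contra", "Banco", "X", "S.A.", "."]
gold = ["B-PESSOA", "I-PESSOA", "O", "O", "O",
        "B-ORGANIZACAO", "I-ORGANIZACAO", "I-ORGANIZACAO", "O"]
# A hypothetical prediction that cuts the organization name short.
pred = ["B-PESSOA", "I-PESSOA", "O", "O", "O",
        "B-ORGANIZACAO", "I-ORGANIZACAO", "O", "O"]

for label, start, end in bio_to_spans(gold):
    print(label, " ".join(tokens[start:end]))
print(span_f1(gold, pred))
```

Exact span matching is strict: in this example the truncated organization counts as both a false positive and a false negative, so only one of the two entities is credited.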

