Lexical noun phrase chunking with Universal Dependencies for Portuguese


Partial parsing retrieves a limited amount of syntactic information from a sentence. This work describes the identification, through partial syntactic analysis, of a specific type of noun phrase, the lexical noun phrase (NPL), in Brazilian Portuguese texts annotated according to the Universal Dependencies (UD) formalism. The Transformation-Based Learning algorithm (TBL–Brill), applied as a baseline, obtained an accuracy of 87.42% using the UD dependency relations and 91.44% using the UD morphosyntactic tags. Two other classifiers, one based on binary trees and the other on a decision forest, performed worse.
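
As a rough illustration of the transformation-based idea mentioned above, the sketch below tags tokens from their UD morphosyntactic (UPOS) tags and then corrects the result with one transformation rule. It is not the authors' implementation: the B/I/O chunk labels, the toy training data, the function names, and the handwritten rule (a real TBL learner would induce its rules from data) are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

# Toy training data, invented for illustration: (UPOS tag, chunk tag) pairs.
# B = begins a noun phrase, I = inside a noun phrase, O = outside (assumed labels).
TRAIN = [
    [("DET", "B"), ("NOUN", "I"), ("VERB", "O"),
     ("DET", "B"), ("ADJ", "I"), ("NOUN", "I")],
    [("PROPN", "B"), ("VERB", "O"), ("ADP", "O"), ("DET", "B"), ("NOUN", "I")],
]


def train_initial_tagger(sentences):
    """Initial state: map each UPOS tag to its most frequent chunk tag."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        for upos, chunk in sentence:
            counts[upos][chunk] += 1
    return {upos: c.most_common(1)[0][0] for upos, c in counts.items()}


def initial_tags(tagger, upos_seq, default="O"):
    """Tag a sentence with the initial-state tagger; unseen UPOS tags get `default`."""
    return [tagger.get(upos, default) for upos in upos_seq]


def apply_rule(tags):
    """Handwritten transformation: change I to B when the previous tag is O
    (or at the start of the sentence), since a phrase cannot continue there."""
    out = []
    prev = "O"
    for tag in tags:
        if tag == "I" and prev == "O":
            tag = "B"
        out.append(tag)
        prev = tag
    return out


def accuracy(gold, pred):
    """Token-level accuracy: fraction of positions where the tags agree."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


tagger = train_initial_tagger(TRAIN)
upos_seq = ["NOUN", "VERB", "DET", "NOUN"]  # hypothetical test sentence
gold = ["B", "O", "B", "I"]

before = initial_tags(tagger, upos_seq)
after = apply_rule(before)
print(before, accuracy(gold, before))
print(after, accuracy(gold, after))
```

On this toy sentence the initial tagger mislabels the sentence-initial noun, and the single rule repairs it. The accuracies reported in the abstract come from the authors' experiments, not from a sketch like this one.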

Keywords: Lexical noun phrase, shallow parsing, Universal Dependencies


DE SOUZA, Aleksander Tomaz; RUIZ, Evandro Eduardo Seron. Lexical noun phrase chunking with Universal Dependencies for Portuguese. In: SIMPÓSIO BRASILEIRO DE TECNOLOGIA DA INFORMAÇÃO E DA LINGUAGEM HUMANA (STIL), 14., 2023, Belo Horizonte/MG. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2023. p. 414-423. DOI: https://doi.org/10.5753/stil.2023.25482.