Portuguese Neural Text Simplification Using Machine Translation
Abstract
Automatic Text Simplification (ATS) has played a significant role in the Natural Language Processing (NLP) field. ATS is a sequence-to-sequence problem that aims to produce a new version of the original text without complex and domain-specific words. It can improve communication and the understanding of documents from specific domains, as well as support second language learning. This paper presents an empirical study on the use of state-of-the-art ATS methods to simplify texts in Portuguese. The literature reports that analyzing Portuguese texts is challenging because fewer resources are available than for other languages (e.g., English). More specifically, this work evaluated different Neural Machine Translation (NMT) techniques for ATS in Portuguese. The experiments showed that NMT achieved promising results on Portuguese texts, reaching a BLEU score of 40.89 using multiple parallel corpora and raising the overall readability score by more than 5 points.
Keywords:
Text simplification, Machine Translation, Deep learning, Natural Language Processing
Published
29/11/2021
How to Cite
LIMA, Tiago B. de; NASCIMENTO, André C. A.; VALENÇA, George; MIRANDA, Pericles; MELLO, Rafael Ferreira; SI, Tapas.
Portuguese Neural Text Simplification Using Machine Translation. In: BRAZILIAN CONFERENCE ON INTELLIGENT SYSTEMS (BRACIS), 10., 2021, Online. Proceedings [...]. Porto Alegre: Sociedade Brasileira de Computação, 2021.
ISSN 2643-6264.