Portuguese Neural Text Simplification Using Machine Translation

  • Tiago B. de Lima UFRPE https://orcid.org/0000-0002-0707-522X
  • André C. A. Nascimento UFRPE
  • George Valença UFRPE
  • Pericles Miranda UFRPE
  • Rafael Ferreira Mello UFRPE
  • Tapas Si Bankura Unnayani Institute of Engineering

Abstract


Automatic Text Simplification (ATS) plays a significant role in Natural Language Processing (NLP). ATS is a sequence-to-sequence problem that aims to create a new version of the original text with complex and domain-specific words removed. It can improve communication and understanding of documents from specific domains, as well as support second language learning. This paper presents an empirical study on the use of state-of-the-art ATS methods to simplify texts in Portuguese. The literature reports that analyzing Portuguese texts is challenging because fewer resources are available than for other languages (e.g., English). More specifically, this work evaluated different Neural Machine Translation (NMT) techniques for ATS in Portuguese. The experiments showed that NMT achieved promising results on Portuguese texts, obtaining a BLEU score of 40.89 using multiple parallel corpora and raising the overall readability score by more than 5 points.
Keywords: Text simplification, Machine Translation, Deep learning, Natural Language Processing
Published
29/11/2021
How to Cite
LIMA, Tiago B. de; NASCIMENTO, André C. A.; VALENÇA, George; MIRANDA, Pericles; MELLO, Rafael Ferreira; SI, Tapas. Portuguese Neural Text Simplification Using Machine Translation. In: BRAZILIAN CONFERENCE ON INTELLIGENT SYSTEMS (BRACIS), 10., 2021, Online. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2021. ISSN 2643-6264.