Portuguese Neural Text Simplification Using Machine Translation

  • Tiago B. de Lima UFRPE https://orcid.org/0000-0002-0707-522X
  • André C. A. Nascimento UFRPE
  • George Valença UFRPE
  • Pericles Miranda UFRPE
  • Rafael Ferreira Mello UFRPE
  • Tapas Si Bankura Unnayani Institute of Engineering

Abstract

Automatic Text Simplification (ATS) plays a significant role in the field of Natural Language Processing (NLP). ATS is a sequence-to-sequence problem that aims to create a new version of an original text in which complex and domain-specific words are removed. It can improve communication and the understanding of documents from specific domains, and it can support second-language learning. This paper presents an empirical study on the use of state-of-the-art ATS methods to simplify texts in Portuguese. The literature reports that analyzing Portuguese texts is challenging because fewer resources are available than for other languages (e.g., English). More specifically, this work evaluated different Neural Machine Translation (NMT) techniques for ATS in Portuguese. The experiments showed that NMT achieved promising results on Portuguese texts, obtaining a BLEU score of 40.89 using multiple parallel corpora and raising the overall readability score by more than 5 points.
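For readers unfamiliar with the BLEU metric cited in the abstract, the following is a minimal sketch of how a corpus-level BLEU score can be computed. It is not the authors' evaluation code: the function names (`ngrams`, `corpus_bleu`), the whitespace tokenization, the single reference per output, and the absence of smoothing are all assumptions made for illustration, and established toolkits may differ in these details.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Corpus-level BLEU on a 0-100 scale, one reference per hypothesis.

    Illustrative assumptions: whitespace tokenization, uniform weights
    over n-gram orders 1..max_n, and no smoothing.
    """
    matches = [0] * max_n  # clipped n-gram matches, per order
    totals = [0] * max_n   # hypothesis n-gram counts, per order
    hyp_len = ref_len = 0

    for hyp, ref in zip(hypotheses, references):
        hyp_tokens, ref_tokens = hyp.split(), ref.split()
        hyp_len += len(hyp_tokens)
        ref_len += len(ref_tokens)
        for n in range(1, max_n + 1):
            hyp_counts = ngrams(hyp_tokens, n)
            ref_counts = ngrams(ref_tokens, n)
            # Counter intersection clips each count by the reference count.
            matches[n - 1] += sum((hyp_counts & ref_counts).values())
            totals[n - 1] += sum(hyp_counts.values())

    if hyp_len == 0 or min(matches) == 0:
        return 0.0  # without smoothing, any zero precision yields BLEU = 0

    # Geometric mean of the modified n-gram precisions.
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # Brevity penalty punishes outputs shorter than their references.
    brevity_penalty = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return 100.0 * brevity_penalty * math.exp(log_precision)


if __name__ == "__main__":
    # Hypothetical example sentences, not taken from the paper's corpora.
    simplified = ["o gato dormiu no sofá"]
    reference = ["o gato dormiu no sofá velho"]
    print(round(corpus_bleu(simplified, reference), 2))
```

In this sketch a system output identical to its reference scores 100, and an output sharing no words with it scores 0. Scores from different tokenization or smoothing settings are not directly comparable, so the sketch should not be expected to reproduce the figure reported above.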
Published
2021-11-29
How to Cite
LIMA, Tiago B. de et al. Portuguese Neural Text Simplification Using Machine Translation. Anais da Brazilian Conference on Intelligent Systems (BRACIS), [S.l.], Nov. 2021. ISSN 2643-6264. Available at: <https://sol.sbc.org.br/index.php/bracis/article/view/19090>. Accessed: 18 May 2024.