Simplifying Administrative Texts for Plain Language using LLM: a Model Comparative Analysis

João Pedro Holanda; Regis Magalhães; Camilo Almendra; Luis Gustavo Coutinho do Rêgo

doi:10.5753/sbsi.2026.248675

João Pedro Holanda UFC
Regis Magalhães UFC
Camilo Almendra UFC
Luis Gustavo Coutinho do Rêgo IFCE


Abstract

Research Context: Plain language is a growing trend toward more inclusive and readable documents. Open and closed LLMs can be used to simplify documents, with varying costs and tradeoffs.

Scientific and/or Practical Problem: Currently, the UFC Inova institution writes simplified versions of its public notices by hand, which lengthens publication time and demands repetitive effort.

Proposed Solution and/or Analysis: We design an LLM-based pipeline that generates simplified versions of public notices with zero-shot prompting following plain language directives. We investigate how open LLMs compare to proprietary models. Although proprietary models are the state of the art, adopting open LLMs lowers the cost of ownership and avoids vendor lock-in, making them a sustainable choice for public-sector universities.

Related IS Theory: We frame this work as a Socio-Technical Systems intervention to enhance access to public information, grounding it in prior research on text complexity and plain-language communication.

Research Method: We ran a text simplification pipeline with different LLMs. The original and AI-generated versions were compared using statistical readability indexes and morphosyntactic metrics. Afterwards, two pairs of documents were evaluated in a survey with reader representatives.

Summary of Results: The quantitative evaluations indicate that Gemini Flash outperformed the Pro version on both sets of metrics, and that the open model Qwen2.5:14b came closest to both on the morphosyntactic metrics. In the qualitative evaluation, automated simplification was well received, but it may better support readers when combined with summarization.

Contributions and Impact to the IS area: This work provides a process to foster the adoption of plain language in the public sector, along with empirical evidence on the effectiveness of open LLMs compared to proprietary models.
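The readability comparison mentioned in the research method can be illustrated with a minimal sketch. It assumes the classic Flesch Reading Ease formula with its English coefficients and a crude vowel-group syllable counter; the abstract does not say which indexes or Portuguese adaptations the study used, so this is not the authors' implementation. The two sample sentences and all function names are hypothetical.

```python
import re

# Assumption: a syllable is approximated by a run of consecutive vowels.
VOWELS = "aeiouáéíóúâêôãõàü"


def count_syllables(word: str) -> int:
    """Rough syllable estimate: number of vowel runs, at least 1."""
    groups = re.findall(f"[{VOWELS}]+", word.lower())
    return max(1, len(groups))


def text_stats(text: str) -> tuple[int, int, int]:
    """Return (sentences, words, syllables) for a piece of text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[^\W\d_]+", text)  # runs of Unicode letters
    syllables = sum(count_syllables(w) for w in words)
    return len(sentences), len(words), syllables


def flesch_reading_ease(text: str) -> float:
    """Classic Flesch Reading Ease; higher scores mean easier text."""
    n_sent, n_words, n_syll = text_stats(text)
    if n_sent == 0 or n_words == 0:
        raise ValueError("text must contain at least one sentence and one word")
    return 206.835 - 1.015 * (n_words / n_sent) - 84.6 * (n_syll / n_words)


# Hypothetical example pair: an administrative sentence and a plain version.
original = (
    "Applicants must submit the required documentation within the "
    "established deadline, under penalty of disqualification from the "
    "selection process."
)
simplified = (
    "Send your documents by the deadline. "
    "If you miss it, you are out of the selection."
)

print(f"original:   {flesch_reading_ease(original):.1f}")
print(f"simplified: {flesch_reading_ease(simplified):.1f}")
```

In this toy pair the simplified version scores higher because it has shorter sentences and fewer syllables per word. A real evaluation would use a language-appropriate index and a proper syllabifier.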

