Middleware for Intercepting Sensitive Data in Chatbots Based on Large Language Models (LLMs)
Abstract
This paper presents a proposed technical solution for issuing preventive alerts during the display of responses generated by chatbots based on Large Language Models (LLMs). The technique consists of intercepting the content generated by the model at display time in order to identify potentially sensitive terms, in line with the principles of the LGPD (Lei Geral de Proteção de Dados), and to show visual alerts to the user.
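The middleware implementation itself is not reproduced on this page. As a rough illustration of the technique the abstract describes, the sketch below shows how an interception layer could scan a model response for patterns commonly treated as personal or sensitive data under the LGPD (e.g., CPF numbers, e-mail addresses, phone numbers) and attach a visual-alert flag before the text reaches the chat interface. This is a minimal Python sketch under assumed conventions: the pattern list, the InterceptionResult structure, and the banner format are hypothetical and are not the authors' implementation.

    import re
    from dataclasses import dataclass, field

    # Hypothetical regex patterns for data commonly treated as personal or
    # sensitive under the LGPD (CPF, e-mail address, Brazilian phone number).
    # The detection strategy used by the paper's middleware may differ.
    SENSITIVE_PATTERNS = {
        "CPF": re.compile(r"\b\d{3}\.\d{3}\.\d{3}-\d{2}\b"),
        "e-mail": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
        "phone": re.compile(r"\(?\d{2}\)?\s?9?\d{4}-?\d{4}"),
    }

    @dataclass
    class InterceptionResult:
        text: str                                   # original model output, unchanged
        alerts: list = field(default_factory=list)  # categories detected in the text

    def intercept_response(model_output: str) -> InterceptionResult:
        """Scan the LLM response at display time and collect alert categories."""
        result = InterceptionResult(text=model_output)
        for label, pattern in SENSITIVE_PATTERNS.items():
            if pattern.search(model_output):
                result.alerts.append(label)
        return result

    def render_with_alerts(result: InterceptionResult) -> str:
        """Prepend a visual warning banner when sensitive terms were detected."""
        if not result.alerts:
            return result.text
        banner = "[LGPD ALERT] Potentially sensitive data detected: " + ", ".join(result.alerts)
        return banner + "\n" + result.text

    if __name__ == "__main__":
        response = "Your record uses CPF 123.456.789-09 and the e-mail ana@example.com."
        print(render_with_alerts(intercept_response(response)))

A pass of this kind runs once over the response, leaves the generated text untouched, and only adds a warning layer, which is consistent with the preventive (rather than blocking) alert goal stated in the abstract.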
References
Barberá, I. (2025). AI Privacy Risks & Mitigations – Large Language Models (LLMs). [link]. Commissioned by the European Data Protection Board (EDPB) under the Support Pool of Experts (SPE). Views expressed are those of the author.
Freiberger, V., Fleig, A., and Buchmann, E. (2025). “You don’t need a university degree to comprehend data protection this way”: LLM-powered interactive privacy policy assessment. In CHI EA ’25: Proceedings of the Extended Abstracts of the CHI Conference on Human Factors in Computing Systems, pages 1–12. Association for Computing Machinery. Published: 25 April 2025.
Greshake, K., Abdelnabi, S., Mishra, S., Endres, C., Holz, T., and Fritz, M. (2023). Not what you’ve signed up for: Compromising real-world LLM-integrated applications with indirect prompt injection. In Proceedings of the 16th ACM Workshop on Artificial Intelligence and Security, AISec ’23, pages 79–90, New York, NY, USA. Association for Computing Machinery.
Jana, S., Biswas, R., Pal, K., Biswas, S., and Roy, K. (2024). The evolution and impact of large language model systems: A comprehensive analysis. Alochana Journal, 13(3):65–78.
LGPD - Brasil (2018). Lei nº 13.709, de 14 de agosto de 2018: Lei Geral de Proteção de Dados Pessoais (LGPD). [link]. Diário Oficial da União, Seção 1, 15 ago. 2018.
Li, T., Das, S., Lee, H.-P., Wang, D., Yao, B., and Zhang, Z. (2024). Human-centered privacy research in the age of large language models. In Extended Abstracts of the CHI Conference on Human Factors in Computing Systems (CHI EA ’24), pages 1–4. Association for Computing Machinery.
Minaee, S., Mikolov, T., Nikzad, N., Chenaghlu, M., Socher, R., Amatriain, X., and Gao, J. (2025). Large language models: A survey. arXiv preprint arXiv:2402.06196.
Mireshghallah, N., Antoniak, M., More, Y., Choi, Y., and Farnadi, G. (2024). Trust no bot: Discovering personal disclosures in human-LLM conversations in the wild. arXiv preprint arXiv:2407.11438. Accepted at COLM 2024. Version 2, last revised 20 Jul 2024.
Patwardhan, N., Marrone, S., and Sansone, C. (2023). Transformers in the real world: A survey on NLP applications. Information, 14(4).
Raza, M., Jahangir, Z., Riaz, M. B., Saeed, M. J., and Sattar, M. A. (2025). Industrial applications of large language models. Scientific Reports, 15(1):13755.
Yan, C., Guan, B., Li, Y., Meng, M. H., Wan, L., and Bai, G. (2025). Understanding and detecting file knowledge leakage in GPT app ecosystem. In Proceedings of the ACM on Web Conference 2025, WWW ’25, pages 3831–3839, New York, NY, USA. Association for Computing Machinery.
Łajewska, W., Spina, D., Trippas, J., and Balog, K. (2024). Explainability for transparent conversational information-seeking. In Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2024, pages 1040–1050. ACM.
Published
22/09/2025
How to Cite
SILVA, Felipe Diego Lobato da; COLETI, Thiago Adriano. Middleware para Interceptação de Dados Sensíveis em Chatbots com Modelos de Linguagem de Grande Porte (LLMs). In: WORKSHOP SOBRE BOTS NA ENGENHARIA DE SOFTWARE (WBOTS), 2., 2025, Recife/PE. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 41-46. DOI: https://doi.org/10.5753/wbots.2025.15215.