Privacy Mediation for Interactions with Large Language Models

Felipe Diego Lobato da Silva (USP); Thiago Adriano Coleti (USP / UENP)

DOI: https://doi.org/10.5753/wics.2026.23591

Abstract

This work presents a privacy mediation tool for interactions with large language models (LLMs), implemented as a browser extension. The tool identifies personal and sensitive data in the text entered by the user before it is sent to the model, using pattern-based rules and named entity recognition (NER) techniques. When sensitive information is detected, just-in-time visual alerts are displayed, letting the user decide whether to proceed with, edit, or cancel sending the message.
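
The pattern-based part of this flow can be illustrated with a minimal TypeScript sketch. Everything in it is an assumption made for illustration and is not taken from the tool itself: the names `RULES`, `detectSensitiveData`, and `mediate`, the two example patterns (e-mail address and formatted CPF number), and the `askUser` callback that stands in for the visual alert. The NER step is not covered.

```typescript
// Hypothetical sketch: names and patterns are illustrative assumptions,
// not the rules used by the tool described in the abstract.
interface Finding { kind: string; match: string; index: number; }
interface Rule { kind: string; pattern: RegExp; }

// Example rules only; a real rule set would need broader coverage and
// validation (for instance, CPF check digits are not verified here).
const RULES: Rule[] = [
  { kind: "email", pattern: /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g },
  { kind: "cpf", pattern: /\b\d{3}\.\d{3}\.\d{3}-\d{2}\b/g },
];

// Scan the text with every rule and return the matches in reading order.
function detectSensitiveData(text: string): Finding[] {
  const findings: Finding[] = [];
  for (const rule of RULES) {
    rule.pattern.lastIndex = 0; // global regexes keep state between calls
    let m: RegExpExecArray | null;
    while ((m = rule.pattern.exec(text)) !== null) {
      findings.push({ kind: rule.kind, match: m[0], index: m.index });
    }
  }
  return findings.sort((a, b) => a.index - b.index);
}

type Decision = "proceed" | "edit" | "cancel";

// Return the text to send, or null when sending is held back.
// Text with no findings passes through without asking the user.
function mediate(
  text: string,
  askUser: (findings: Finding[]) => Decision
): string | null {
  const findings = detectSensitiveData(text);
  if (findings.length === 0) return text;
  return askUser(findings) === "proceed" ? text : null;
}
```

In this sketch both "edit" and "cancel" hold the message back by returning `null`; in an extension, "edit" would presumably return focus to the input field so the revised text can be scanned again before sending.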
