Data Anonymization for Artificial Intelligence Using the Gorilla Troops Algorithm

Ivo A. Pimenta; Ramon S. Araújo; Renann L. Rodrigues; Matheus M. Silveira; Rafael L. Gomes

doi:10.5753/sbrc.2025.6252

Ivo A. Pimenta UECE http://orcid.org/0009-0005-4571-6242
Ramon S. Araújo UECE
Renann L. Rodrigues UECE
Matheus M. Silveira Uber Technologies Inc.
Rafael L. Gomes UECE https://orcid.org/0000-0001-7922-0695


Abstract

Collecting data about the environment and about people through the Internet of Things (IoT) is already a reality, and these data feed innovative solutions based on Artificial Intelligence (AI). However, especially in healthcare, such user data must comply with the requirements of privacy laws. The challenge is therefore to understand the utility of the data used in AI solutions while meeting legal requirements, for example by anonymizing the data. Traditional anonymization methods compromise the effectiveness of AI models. In this context, this article proposes the GOK-Privacy algorithm, which combines a metaheuristic inspired by the behavior of gorillas with clustering techniques, making it possible to preserve privacy without sacrificing the performance of analytical models. Experiments with real healthcare data show the effectiveness of the proposal in real-world scenarios.

Keywords: Anonymization, Machine learning

