Mechanism to Mitigate Model Poisoning Attacks in Federated Learning

  • Marcos G. O. Morais (UNICAMP / UFLA)
  • Joahannes B. D. da Costa (UNICAMP)
  • Luis F. G. Gonzalez (UNICAMP)
  • Allan M. de Souza (UNICAMP)
  • Leandro A. Villas (UNICAMP)

Abstract

Federated Learning (FL) is a distributed technique for training machine learning models in which data is processed locally and only local parameters are shared with an aggregation server. Because client data never leaves the device, validating the submitted parameters is extremely difficult, which opens the door to attacks by malicious clients. These attackers deliberately influence training and invalidate the resulting model by injecting data or even manipulating model parameters. This work therefore presents RAGNAR, which uses metrics computed during training to mitigate attacks on the FL environment in real time. In addition, RAGNAR employs a client-parameter aggregation strategy that adapts dynamically to the current situation. Experimental evaluations show that RAGNAR significantly reduces the accuracy loss of participating clients (by 31%) while maintaining high accuracy levels (98%).
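RAGNAR's actual metrics and adaptive aggregation rule are given in the full paper; as a rough, hypothetical illustration of the general idea the abstract describes (scoring client updates against a robust central estimate and excluding outliers before a FedAvg-style weighted average), consider the Python sketch below. The function name, the MAD-based outlier score, and the `z_threshold` parameter are assumptions for illustration, not RAGNAR's mechanism.

```python
import numpy as np

def robust_aggregate(client_updates, client_sizes, z_threshold=2.5):
    """Aggregate client parameter vectors, discarding outliers.

    client_updates: list of 1-D numpy arrays (flattened model parameters).
    client_sizes:   number of local samples per client (FedAvg weighting).
    z_threshold:    hypothetical cutoff on the robust z-score of each
                    update's distance to the coordinate-wise median.
    """
    updates = np.stack(client_updates)           # (n_clients, n_params)
    median = np.median(updates, axis=0)          # robust central estimate
    dists = np.linalg.norm(updates - median, axis=1)

    # Robust z-score via the median absolute deviation (MAD).
    mad = np.median(np.abs(dists - np.median(dists))) + 1e-12
    scores = 0.6745 * (dists - np.median(dists)) / mad

    keep = scores < z_threshold                  # drop suspiciously distant updates
    weights = np.asarray(client_sizes, dtype=float) * keep
    weights /= weights.sum()

    # Weighted (FedAvg-style) average over the surviving clients.
    return weights @ updates

# Example: 9 benign clients plus 1 poisoned (scaled) update.
rng = np.random.default_rng(0)
benign = [rng.normal(0.0, 0.1, size=100) for _ in range(9)]
poisoned = [10.0 * np.ones(100)]
agg = robust_aggregate(benign + poisoned, client_sizes=[100] * 10)
print(np.abs(agg).max())  # stays near the benign scale
```

The median-plus-MAD score is only one way to flag anomalous updates; the point is that filtering happens server-side, per round, using nothing but the submitted parameters, which is consistent with FL's constraint that raw client data is never inspected.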

Published
20/05/2024

MORAIS, Marcos G. O.; COSTA, Joahannes B. D. da; GONZALEZ, Luis F. G.; SOUZA, Allan M. de; VILLAS, Leandro A. Mecanismo para Mitigar Ataques de Envenenamento de Modelo no Aprendizado Federado. In: WORKSHOP DE COMPUTAÇÃO URBANA (COURB), 8., 2024, Niterói/RJ. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2024. p. 224-237. ISSN 2595-2706. DOI: https://doi.org/10.5753/courb.2024.3398.