Fast, Private, and Protected: An Approach to Efficient Federated Learning in Hostile Environments

Nicolas R. G. Assumpção (Unicamp); Leandro A. Villas (Unicamp)

DOI: https://doi.org/10.5753/courb.2024.2523

Abstract

Federated Learning (FL) is a distributed training method in which devices collaborate to build a global model without sharing their data, which enables training in scenarios that involve private information. However, guaranteeing data privacy while also protecting model convergence is a major challenge, since existing solutions usually provide only one of these two protections. In this work, we introduce RPP (Rápido, Privado e Protegido, i.e., Fast, Private, and Protected), a fast-converging approach that protects training against model poisoning attacks while still allowing homomorphic encryption techniques to be used to protect data privacy. It does so by using client evaluations to assess previous rounds and to recover training after an aggressive attack. RPP also uses reputation values to make it harder for attackers to be selected. Experiments compared RPP with other approaches from the literature (FedAvg, PoC, Median Aggregation, and Trimmed Mean Aggregation) and showed that RPP achieved fast and consistent convergence in scenarios where all the other approaches failed to converge.
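For readers unfamiliar with the baselines named above, the following is a minimal sketch of how an unweighted mean (a simplified stand-in for FedAvg, which normally weights clients by dataset size), coordinate-wise median aggregation, and coordinate-wise trimmed mean aggregation react to a single poisoned update. It is not RPP itself, which the abstract does not describe in enough detail to reproduce. The function names, the `trim` parameter, and the toy update vectors are assumptions made purely for illustration.

```python
from statistics import median


def fedavg(updates):
    """Unweighted coordinate-wise mean (simplified FedAvg; assumes equal client weights)."""
    n = len(updates)
    return [sum(col) / n for col in zip(*updates)]


def median_aggregate(updates):
    """Coordinate-wise median of the client updates."""
    return [median(col) for col in zip(*updates)]


def trimmed_mean_aggregate(updates, trim=1):
    """Coordinate-wise mean after dropping the `trim` smallest and largest values."""
    n = len(updates)
    if 2 * trim >= n:
        raise ValueError("trim would discard every update")
    result = []
    for col in zip(*updates):
        kept = sorted(col)[trim:n - trim]
        result.append(sum(kept) / len(kept))
    return result


# Toy data (hypothetical): four honest clients and one attacker
# submitting an extreme, poisoned update.
honest = [[1.0, 2.0], [1.2, 1.8], [0.8, 2.2], [1.0, 2.0]]
poisoned = honest + [[100.0, -100.0]]

print("mean:        ", fedavg(poisoned))
print("median:      ", median_aggregate(poisoned))
print("trimmed mean:", trimmed_mean_aggregate(poisoned, trim=1))
```

In this toy case the plain mean is pulled far from the honest updates, while the median and trimmed mean stay close to them. Both robust aggregators need to inspect individual client updates, which is part of why combining them with encrypted aggregation is difficult, the tension the abstract refers to.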
