Aprendizado Federado com Agrupamento Hierárquico de Clientes para Aumento da Acurácia (Federated Learning with Hierarchical Client Clustering for Increased Accuracy)
Abstract
The performance of federated learning depends on the data distribution and deteriorates when clients hold heterogeneous data. We propose a hierarchical client clustering system to mitigate the performance problems of federated learning in scenarios where the data are not independent and identically distributed (non-IID). Our proposal efficiently groups clients whose data distributions are approximately IID, achieving fast and accurate model convergence. The system is initialized by running, on the server, an unsupervised clustering algorithm over the bias vector of the last layer of each client's neural network. The DBSCAN algorithm produced the best clustering results among those evaluated, correctly identifying clusters even when all clients hold datasets with IID distributions. Finally, the results show an increase in model accuracy of up to 16% compared to traditional federated learning in non-IID scenarios.
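As a rough illustration of the initialization step described in the abstract, the sketch below clusters last-layer bias vectors with a minimal pure-Python DBSCAN. It is not the authors' implementation. The bias vectors are synthetic toy data produced by a hypothetical helper, `make_bias`, which assumes that a client's bias is larger on the classes that dominate its local dataset. The values of `eps` and `min_samples` are likewise assumptions chosen for this toy data.

```python
import math
import random


def region_query(points, i, eps):
    """Indices of all points within distance eps of points[i] (including i)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_samples):
    """Minimal DBSCAN: returns one label per point, with -1 meaning noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_samples:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1  # points[i] is a core point: start a new cluster
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_samples:
                queue.extend(j_neighbors)  # j is also a core point: keep expanding
    return labels


def make_bias(dominant_classes, rng, num_classes=10):
    """Hypothetical toy bias vector: larger bias on the client's dominant classes."""
    return [(1.0 if c in dominant_classes else 0.0) + rng.uniform(-0.05, 0.05)
            for c in range(num_classes)]


rng = random.Random(0)

# Non-IID toy scenario: three groups of four clients, each group dominated
# by a different set of classes.
groups = [{0, 1, 2}, {3, 4, 5}, {6, 7, 8}]
non_iid_biases = [make_bias(g, rng) for g in groups for _ in range(4)]
labels = dbscan(non_iid_biases, eps=1.0, min_samples=2)
print(labels)

# IID toy scenario: every client sees all classes, so all bias vectors are close.
iid_biases = [make_bias(set(range(10)), rng) for _ in range(12)]
iid_labels = dbscan(iid_biases, eps=1.0, min_samples=2)
print(iid_labels)
```

With this toy data, the first call separates the clients into three clusters that match the three groups, and the second call places all twelve clients in a single cluster. This mirrors the IID case mentioned in the abstract, where a density-based method does not need the number of clusters to be fixed in advance.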
Published
2022-05-23
How to Cite
SOUZA, Lucas Airam C. de; CAMILO, Gustavo F.; SAMMARCO, Matteo; CAMPISTA, Miguel Elias M.; COSTA, Luís Henrique M. K. Aprendizado Federado com Agrupamento Hierárquico de Clientes para Aumento da Acurácia. In: BRAZILIAN SYMPOSIUM ON COMPUTER NETWORKS AND DISTRIBUTED SYSTEMS (SBRC), 40., 2022, Fortaleza. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2022. p. 545-558. ISSN 2177-9384. DOI: https://doi.org/10.5753/sbrc.2022.222371.
