Federated Learning with Embedding Generation for Controlling Statistical Heterogeneity

Gustavo S. Guaragna; Joahannes B. D. da Costa; Leandro A. Villas; Allan M. de Souza

doi:10.5753/sbrc.2026.19960

Gustavo S. Guaragna UNICAMP http://orcid.org/0009-0003-3485-1938
Joahannes B. D. da Costa UNIFESP https://orcid.org/0000-0001-9973-2479
Leandro A. Villas UNICAMP https://orcid.org/0000-0002-3372-3366
Allan M. de Souza UNICAMP https://orcid.org/0000-0002-5518-8392

Abstract

Federated Learning enables the collaborative training of machine learning models without sharing local data, making it a promising alternative amid growing privacy concerns. However, heterogeneity in the data distribution across clients remains one of the main challenges and degrades model performance. In this work, we propose FLEG, an approach that alternates the training of a classifier with that of a Conditional Generative Adversarial Network (CGAN) to augment the clients' datasets, reduce the statistical heterogeneity of the federation, and consequently improve the classifier's performance. Unlike conventional approaches, FLEG generates synthetic embeddings rather than images, adding an extra layer of protection against possible data leakage. Experimental results show that FLEG outperforms the FedAvg baseline by up to 14 percentage points in validation accuracy on the CIFAR-10 dataset, in the evaluated configurations. The code is available at https://github.com/gustavoguaragna/FLEG.
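As a rough illustration of the augmentation idea only, the sketch below tops up a client's under-represented classes with synthetic embeddings until every class matches the client's largest class. It is not the FLEG implementation: the balancing rule, the function names (`synthetic_quota`, `fake_generator`, `augment_client`), and the Gaussian stand-in for a trained conditional generator are all assumptions made for this example.

```python
import random
from collections import Counter


def synthetic_quota(labels, num_classes):
    """Count how many synthetic samples each class needs to match the largest class."""
    counts = Counter(labels)
    target = max(counts.values(), default=0)
    return {c: target - counts.get(c, 0) for c in range(num_classes)}


def fake_generator(label, dim, rng):
    """Hypothetical stand-in for a trained conditional generator.

    Returns one random embedding of size `dim` centred on the label value.
    A real system would sample from a CGAN conditioned on `label`.
    """
    return [rng.gauss(float(label), 1.0) for _ in range(dim)]


def augment_client(embeddings, labels, num_classes, dim, seed=0):
    """Append synthetic embeddings so every class has the same number of samples."""
    rng = random.Random(seed)
    aug_x, aug_y = list(embeddings), list(labels)
    for label, missing in synthetic_quota(labels, num_classes).items():
        for _ in range(missing):
            aug_x.append(fake_generator(label, dim, rng))
            aug_y.append(label)
    return aug_x, aug_y


if __name__ == "__main__":
    # Toy client holding three samples of class 0, one of class 1, none of class 2.
    x, y = augment_client([[0.0, 0.0]] * 4, [0, 0, 0, 1], num_classes=3, dim=2)
    print(Counter(y))
```

Under this assumed rule, a client with a skewed label distribution ends up with a uniform one, which is one simple way to picture how synthetic embeddings could reduce statistical heterogeneity across a federation.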

