ASTRA: Adaptive Student-Teacher Method for Robust Aggregation and Client Drift Reduction in Federated Learning
Abstract
Federated Learning (FL) faces critical convergence challenges when client data is heterogeneous (Non-IID). Existing solutions often trade computational efficiency for stability, incurring high overhead. This paper proposes ASTRA (Adaptive Student-Teacher Method for Robust Aggregation and Client Drift Reduction in Federated Learning), a method that integrates geometric regularization with a dynamic self-distillation mechanism. Unlike static approaches, ASTRA employs a curriculum-based schedule that transitions from intensive guidance to periodic semantic correction. This strategy keeps local updates aligned with the global objective (mitigating client drift) while drastically reducing the computational cost of the teacher model. Experimental results on Non-IID CIFAR-10 show that ASTRA outperforms the structural baseline (FedProx) by 32.7% in accuracy under severe heterogeneity, effectively preventing catastrophic divergence while maintaining training speeds within 2.3% of the lightweight FedAvg algorithm.
References
Amiri, S. et al. (2024). Balancing privacy and performance in federated learning: A systematic literature review on methods and metrics. Journal of Parallel and Distributed Computing.
Cooray, L., Sendanayake, J., Vithanaarachchi, P., and Priyadarshana, Y. H. P. P. (2025). Deep federated learning: a systematic review of methods, applications, and challenges. Frontiers in Computer Science, 7.
Hinton, G., Vinyals, O., and Dean, J. (2015). Distilling the knowledge in a neural network. arXiv preprint arXiv:1503.02531.
Jeong, E., Oh, S., Kim, H., Park, J., Bennis, M., and Kim, S.-L. (2018). Communication-efficient on-device machine learning: Federated distillation and augmentation under non-iid private data. arXiv preprint arXiv:1811.11479.
Jimenez G., D. M., Solans, D., Heikkilä, M. A., Vitaletti, A., Kourtellis, N., Anagnostopoulos, A., and Chatzigiannakis, I. (2024). Non-IID data in federated learning: A survey with taxonomy, metrics, methods, frameworks and future directions. arXiv preprint arXiv:2411.12377.
Karimireddy, S. P., Kale, S., Mohri, M., Reddi, S., Stich, S., and Suresh, A. T. (2020). Scaffold: Stochastic controlled averaging for federated learning. In International conference on machine learning, pages 5132–5143. PMLR.
Li, Q., He, B., and Song, D. (2021). Model-contrastive federated learning. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 10713–10722.
Li, T., Sahu, A. K., Zaheer, M., Sanjabi, M., Talwalkar, A., and Smith, V. (2020). Federated optimization in heterogeneous networks. Proceedings of Machine Learning and Systems, 2:429–450.
McMahan, B., Moore, E., Ramage, D., Hampson, S., and Agüera y Arcas, B. (2017). Communication-efficient learning of deep networks from decentralized data. In 20th International Conference on Artificial Intelligence and Statistics, pages 1273–1282.
Mhou, K. and Senekane, M. (2026). HAPI-FedProx: Heterogeneity-aware adaptive proximal optimization for federated learning. Springer Nature Link.
Mora, A. et al. (2024). Knowledge distillation in federated learning: a practical guide. In 33rd International Joint Conference on Artificial Intelligence.
Qin, L. et al. (2024). Knowledge distillation in federated learning: a survey on long lasting challenges and new solutions. arXiv preprint arXiv:2406.10861.
Rodríguez-Barroso, N. et al. (2024). An overview of implementing security and privacy in federated learning. Artificial Intelligence Review, 57.
Wang, J., Liu, Q., Liang, H., Joshi, G., and Poor, H. V. (2021). A novel framework for the analysis and design of heterogeneous federated learning. IEEE Transactions on Signal Processing, 69:5234–5249.
Yan, Y., Feng, C.-M., Ye, M., Zuo, W., Li, P., Goh, R. S. M., Zhu, L., and Chen, C. (2023). Rethinking client drift in federated learning: A logit perspective. arXiv preprint arXiv:2308.10162.
Published
25/05/2026
How to Cite
GONÇALVES, João; SOUSA, John; VEIGA, Rafael; BASTOS, Lucas; PACHECO, Lucas; MEDEIROS, Iago; ROSÁRIO, Denis; CERQUEIRA, Eduardo.
ASTRA: Adaptive Student-Teacher Method for Robust Aggregation and Client Drift Reduction in Federated Learning. In: SIMPÓSIO BRASILEIRO DE REDES DE COMPUTADORES E SISTEMAS DISTRIBUÍDOS (SBRC), 44., 2026, Praia do Forte/BA. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2026. p. 281-294. ISSN 2177-9384. DOI: https://doi.org/10.5753/sbrc.2026.19887.
