Revisiting Gradient Staleness: Evaluating Distance Metrics for Asynchronous Federated Learning Aggregation

  • Patrick Wilhelm Technische Universität Berlin
  • Odej Kao Technische Universität Berlin

Abstract


In asynchronous federated learning (FL), client devices send updates to a central server at varying times based on their computational speed, often using stale versions of the global model. This staleness can degrade the convergence and accuracy of the global model. Previous work, such as AsyncFedED, proposed an adaptive aggregation method using Euclidean distance to measure staleness. In this paper, we extend this approach by exploring alternative distance metrics to more accurately capture the effect of gradient staleness. We integrate these metrics into the aggregation process and evaluate their impact on convergence speed, model performance, and training stability under heterogeneous clients and non-IID data settings. Our results demonstrate that certain metrics lead to more robust and efficient asynchronous FL training, offering a stronger foundation for practical deployment.
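The staleness-aware aggregation described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names (`euclidean`, `cosine_distance`, `staleness_weight`, `aggregate`) and the weighting formula `1 / (1 + alpha * d)` are assumptions chosen to show how a distance metric between the client's stale base model and the current global model can down-weight a late update.

```python
import numpy as np

def euclidean(a, b):
    """L2 distance between two flattened parameter vectors."""
    return float(np.linalg.norm(a - b))

def cosine_distance(a, b):
    """1 - cosine similarity: captures directional drift rather than magnitude."""
    denom = float(np.linalg.norm(a) * np.linalg.norm(b))
    return 1.0 - float(np.dot(a, b)) / denom if denom else 0.0

def staleness_weight(global_params, stale_params, metric, alpha=1.0):
    """Down-weight an update in proportion to how far the client's base
    model has drifted from the current global model (hypothetical rule)."""
    d = metric(global_params, stale_params)
    return 1.0 / (1.0 + alpha * d)

def aggregate(global_params, client_update, stale_params, metric, lr=1.0):
    """Apply one asynchronous client update, scaled by its staleness weight."""
    w = staleness_weight(global_params, stale_params, metric)
    return global_params + lr * w * client_update
```

Swapping `metric` (e.g., `euclidean` vs. `cosine_distance`) is the knob the paper's evaluation varies: a magnitude-sensitive metric penalizes large parameter drift, while a directional one ignores scale and reacts only to a change in update direction.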
Keywords: Measurement, Training, Adaptation models, Federated learning, Computational modeling, High performance computing, Stability analysis, Heterogeneous networks, Servers, Convergence, Distributed Computing, Cloud Computing, Edge AI, Asynchronous communication
Published
28/10/2025
WILHELM, Patrick; KAO, Odej. Revisiting Gradient Staleness: Evaluating Distance Metrics for Asynchronous Federated Learning Aggregation. In: WORKSHOP ON CLOUD COMPUTING (WCC) - INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE AND HIGH PERFORMANCE COMPUTING (SBAC-PAD), 37. , 2025, Bonito/MS. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025 . p. 77-83.