Adaptive Model Switching for Dynamic Inference Serving in the IoT-edge-cloud Continuum
Abstract
Although Internet of Things (IoT) devices have enabled Deep Learning (DL) applications to operate closer to end-users, their limited processing power prevents them from scaling to modern, computationally heavy DL algorithms. A promising solution is to offload computation to remote edge and cloud servers throughout the IoT-edge-cloud continuum. However, inconsistent network conditions, variable latency, and the distinct accuracy and precision requirements of different DL applications hinder the ability to uphold the necessary performance standards. To address these issues, we propose AdaptSwitch, a dynamic framework that selects both the DL model and the most suitable execution location across the IoT-edge-cloud continuum, based on the application’s accuracy and latency requirements and on the network conditions at each moment. We evaluated AdaptSwitch using real IoT, edge, and cloud devices running state-of-the-art DL models for image classification, while considering network delay data from real-world 5G/Edge/Cloud traffic datasets. Experimental results demonstrate that our approach improves Quality of Experience (QoE) by reducing inference latency while ensuring accuracy requirements are met through adaptive model switching.
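The selection described above can be pictured as a small optimization: among the (model, execution tier) pairs available in the continuum, keep those whose accuracy meets the application's requirement, and pick the one with the lowest end-to-end latency under the current network delays. The sketch below is purely illustrative; the candidate names, numbers, and the greedy policy are assumptions, not the actual AdaptSwitch algorithm.

```python
# Illustrative sketch of an accuracy-constrained, latency-minimizing
# (model, execution location) selection. All values are hypothetical.
from dataclasses import dataclass

@dataclass(frozen=True)
class Candidate:
    model: str         # DL model name (hypothetical)
    tier: str          # execution location: "iot", "edge", or "cloud"
    accuracy: float    # top-1 accuracy of the model
    compute_ms: float  # inference time of this model on this tier

def select(candidates, net_delay_ms, min_accuracy):
    """Return the feasible (model, tier) pair with the lowest
    end-to-end latency (network delay + compute), or None."""
    feasible = [c for c in candidates if c.accuracy >= min_accuracy]
    if not feasible:
        return None  # no configuration satisfies the accuracy requirement
    return min(feasible, key=lambda c: net_delay_ms[c.tier] + c.compute_ms)

# Hypothetical candidates and current per-tier network delays.
candidates = [
    Candidate("mobilenet_v2", "iot",   0.72, 120.0),
    Candidate("resnet50",     "edge",  0.76,  15.0),
    Candidate("efficientnet", "cloud", 0.80,  10.0),
]
net_delay = {"iot": 0.0, "edge": 8.0, "cloud": 60.0}

best = select(candidates, net_delay, min_accuracy=0.75)
print(best.model, best.tier)  # resnet50 edge, under these assumed numbers
```

Re-running the selection whenever the measured network delays or the application's accuracy target change is what makes the switching adaptive: a delay spike on the cloud link, for instance, can shift the choice to a lighter model on the edge.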
Keywords:
Adaptation models, Cloud computing, Accuracy, Computational modeling, Switches, Internet of Things, Servers, Quality of experience, Modeling, Load modeling, DL, Model, Switching, Inference, Serving, Offloading, Continuum, IoT, Edge, Cloud
Published
24/11/2025
How to Cite
KNORST, Tiago; JORDAN, Michael G.; KOROL, Guilherme; RUTZIG, Mateus Beck; BECK, Antonio Carlos Schneider. Adaptive Model Switching for Dynamic Inference Serving in the IoT-edge-cloud Continuum. In: SIMPÓSIO BRASILEIRO DE ENGENHARIA DE SISTEMAS COMPUTACIONAIS (SBESC), 15., 2025, Campinas/SP. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 85-90. ISSN 2237-5430.
