K8s-DT: A Kubernetes Digital Twin Based on Stochastic Models

  • Iúre de Sousa Fé UFPI
  • Luiz Fernando Bittencourt UNICAMP
  • Paulo Maciel UFPE
  • Francisco Airton Silva UFPI

Abstract


Microservice-based applications rely on autoscaling to balance cost and quality of service. However, reactive policies such as the Kubernetes Horizontal Pod Autoscaler (HPA) tend to add replicas too late and depend on thresholds that are often tuned by trial and error. This delay increases response time during workload peaks and wastes resources when the load subsides. To address this gap, we couple a Digital Twin based on Stochastic Petri Nets (SPN) with the Kubernetes cluster and conduct SLA-driven what-if analyses under explicit response-time constraints. Our goal is to dynamically apply replica configurations that satisfy the SLA while minimizing resource consumption. We first validate the predictions produced by the Digital Twin against a real cloud environment at a 95% confidence level. We then evaluate two experimental scenarios with distinct workload profiles, in which the Digital Twin meets the SLA while reducing the average number of replicas by 32.83% (high-variance workload) and 18.7% (random workload) compared to the best HPA baselines. Overall, our approach reduces resource usage without compromising SLA compliance and eliminates the need for workload-dependent threshold calibration in HPA.
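The what-if step described in the abstract (evaluate candidate replica counts against a response-time SLA and keep the smallest one that complies) can be illustrated with a minimal sketch. This is not the authors' SPN-based Digital Twin: as a stand-in, it assumes an M/M/c queue (Erlang C formula) as the performance model, and every name in it (`erlang_c`, `mean_response_time`, `min_replicas_for_sla`, the rates, and the `max_replicas` bound) is hypothetical.

```python
import math


def erlang_c(servers: int, offered_load: float) -> float:
    """Probability that an arriving request has to queue in an M/M/c system.

    offered_load is arrival_rate / service_rate and must be < servers.
    """
    below = sum(offered_load**k / math.factorial(k) for k in range(servers))
    top = (offered_load**servers / math.factorial(servers)) * (
        servers / (servers - offered_load)
    )
    return top / (below + top)


def mean_response_time(replicas: int, arrival_rate: float, service_rate: float) -> float:
    """Mean response time (queueing delay + service time) with `replicas` servers.

    Assumption: an M/M/c model stands in for the paper's SPN model.
    Returns infinity when the system is unstable (load >= capacity).
    """
    offered_load = arrival_rate / service_rate
    if offered_load >= replicas:
        return math.inf
    wait = erlang_c(replicas, offered_load) / (replicas * service_rate - arrival_rate)
    return wait + 1.0 / service_rate


def min_replicas_for_sla(arrival_rate, service_rate, sla_seconds, max_replicas=50):
    """What-if search: smallest replica count whose predicted mean response
    time satisfies the SLA, or None if no candidate up to max_replicas does."""
    for replicas in range(1, max_replicas + 1):
        if mean_response_time(replicas, arrival_rate, service_rate) <= sla_seconds:
            return replicas
    return None


if __name__ == "__main__":
    # Hypothetical workload: 40 req/s, each replica serves 10 req/s, SLA of 150 ms.
    print(min_replicas_for_sla(arrival_rate=40.0, service_rate=10.0, sla_seconds=0.15))
```

With these illustrative numbers the search settles on 6 replicas: 5 replicas are stable but predicted to exceed the 150 ms target. A linear scan is used for clarity; because the predicted response time decreases as replicas are added, a faster search over the candidate range would also work.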

Published
2026-05-25
FÉ, Iúre de Sousa; BITTENCOURT, Luiz Fernando; MACIEL, Paulo; SILVA, Francisco Airton. K8s-DT: A Kubernetes Digital Twin Based on Stochastic Models. In: BRAZILIAN SYMPOSIUM ON COMPUTER NETWORKS AND DISTRIBUTED SYSTEMS (SBRC), 44., 2026, Praia do Forte/BA. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2026. p. 884-897. ISSN 2177-9384. DOI: https://doi.org/10.5753/sbrc.2026.19307.
