A predictive approach for dynamic replication of operators in distributed stream processing systems

  • Daniel Wladdimiro Sorbonne University / INRIA / CNRS / LIP6
  • Luciana Arantes Sorbonne University / INRIA / CNRS / LIP6
  • Pierre Sens Sorbonne University / INRIA / CNRS / LIP6
  • Nicolas Hidalgo Universidad Diego Portales

Abstract


Stream Processing Systems (SPSs) can experience significant fluctuations in input rate. To address this issue, some existing solutions reconfigure the SPS by replicating its operators. However, such reconfiguration usually incurs high system downtime. Moreover, reconfiguration decisions are based only on resource utilization and do not balance the load between replicas. In this paper, we propose a predictive SPS that dynamically determines the number of replicas each operator needs, based not only on current resource utilization and input rate variation but also on the events that the overloaded operator has not yet been able to process and that therefore remain in its queue. In addition, our SPS implements a load balancer that distributes incoming events more evenly among the replicas of an operator. Our solution has been integrated into Storm. To avoid reconfiguration downtime, our SPS preallocates a pool of replicas, each of which can be activated or deactivated according to per-operator input load predictions. Using real traffic traces with different applications, we conducted experiments on Google Cloud Platform (GCP), evaluating our SPS and comparing it with Storm and DABS-Storm.
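
The abstract names the inputs to the replica decision (resource utilization, input rate variation, and queued events) but gives no formula. The following is a minimal sketch of how such a decision could be computed. It is not the authors' algorithm. The sizing rule, the `drain_window_s` parameter, the `OperatorStats` fields, and the least-loaded balancing policy are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class OperatorStats:
    """Hypothetical per-operator measurements (names are illustrative)."""
    input_rate: float      # events/s observed in the current window
    predicted_rate: float  # events/s forecast for the next window (assumed given)
    queue_length: int      # events still waiting in the operator's queue
    service_rate: float    # events/s a single replica can process (assumed constant)


def required_replicas(stats: OperatorStats, pool_size: int,
                      drain_window_s: float = 10.0) -> int:
    """Assumed sizing rule: cover the larger of the observed and predicted
    input rates, plus the extra rate needed to drain the backlog within
    drain_window_s seconds. The result is clamped to the preallocated pool."""
    if stats.service_rate <= 0 or drain_window_s <= 0:
        raise ValueError("service_rate and drain_window_s must be positive")
    demand = max(stats.input_rate, stats.predicted_rate)
    demand += stats.queue_length / drain_window_s
    needed = math.ceil(demand / stats.service_rate)
    return max(1, min(pool_size, needed))


def pick_replica(queue_lengths: list[int], active: int) -> int:
    """Assumed balancing policy: route the next event to the active replica
    with the shortest queue. Only the first `active` replicas are considered;
    the rest of the pool stays deactivated."""
    return min(range(active), key=lambda i: queue_lengths[i])
```

For example, with an observed rate of 800 events/s, a predicted rate of 1000 events/s, 500 queued events, and 300 events/s per replica, this rule activates 4 replicas out of a pool of 8. Because the pool is preallocated, changing the count only toggles replicas on or off, which is the property the abstract relies on to avoid reconfiguration downtime.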
Keywords: Stream processing, Adaptive SPS, Predictive algorithm, Replication, Google Cloud Platform
Published
02/11/2022
WLADDIMIRO, Daniel; ARANTES, Luciana; SENS, Pierre; HIDALGO, Nicolas. A predictive approach for dynamic replication of operators in distributed stream processing systems. In: INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE AND HIGH PERFORMANCE COMPUTING (SBAC-PAD), 34., 2022, Bordeaux/France. Proceedings [...]. Porto Alegre: Sociedade Brasileira de Computação, 2022. p. 120-129.