Anomaly Detection for Infrastructure Key Performance Indicators under Real-Time Constraints
Abstract
The increasing complexity of IT infrastructure systems, such as data centers, cloud services, and distributed computing platforms, makes continuous monitoring of Key Performance Indicators (KPIs) indispensable. These indicators provide critical signals about system health, enabling the identification of anomalous behaviors that may precede failures, performance degradation, or service outages. While time-series anomaly detection has been widely studied, few works provide systematic comparative evaluations tailored to infrastructure KPIs under real-time constraints. This paper benchmarks seven unsupervised methods—AutoRegressive Integrated Moving Average (ARIMA), Long Short-Term Memory (LSTM), Autoencoder, Variational Autoencoder (VAE), One-Class Support Vector Machine (SVM), Kernel Density Estimation (KDE), and Isolation Forest—on univariate series from the Numenta Anomaly Benchmark (NAB) and Yahoo Webscope. We group series with Hierarchical Density-Based Spatial Clustering of Applications with Noise (HDBSCAN) to enable cluster-wise analysis and evaluate using point-wise metrics with temporal tolerance, event-level Intersection over Union (IoU), and alert-delay statistics. All methods are run under a causal, sliding-window protocol that approximates streaming conditions without look-ahead. Results show that LSTM achieves the most balanced performance, while ARIMA is an effective statistical baseline under short windows. Reconstruction and density-based methods underperform in this regime, reflecting sensitivity to window size and distribution shifts. These findings emphasize the importance of evaluation protocols aligned with deployment conditions, while also highlighting the need for systematic hyperparameter sensitivity studies and real-time simulations to assess scalability and adaptability in streaming environments.
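The evaluation protocol described above (causal sliding-window scoring, event-level IoU, and alert delay) can be illustrated with a minimal sketch. The abstract does not give the implementation, so everything here is an assumption: the rolling z-score detector is only a stand-in and is not one of the seven benchmarked methods, and the function names, window size, and threshold are hypothetical.

```python
from typing import List, Sequence, Tuple

Interval = Tuple[int, int]  # inclusive (start, end) indices


def causal_zscores(series: Sequence[float], window: int) -> List[float]:
    """Score each point using only the `window` observations before it (no look-ahead)."""
    scores = []
    for t in range(len(series)):
        history = series[max(0, t - window):t]
        if len(history) < 2:
            scores.append(0.0)  # not enough past data to estimate spread
            continue
        mean = sum(history) / len(history)
        std = (sum((x - mean) ** 2 for x in history) / len(history)) ** 0.5
        scores.append(abs(series[t] - mean) / std if std > 0 else 0.0)
    return scores


def to_events(flags: Sequence[bool]) -> List[Interval]:
    """Collapse consecutive flagged points into inclusive (start, end) intervals."""
    events, start = [], None
    for i, flagged in enumerate(flags):
        if flagged and start is None:
            start = i
        elif not flagged and start is not None:
            events.append((start, i - 1))
            start = None
    if start is not None:
        events.append((start, len(flags) - 1))
    return events


def interval_iou(a: Interval, b: Interval) -> float:
    """Intersection over Union of two inclusive index intervals."""
    inter = max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)
    union = (a[1] - a[0] + 1) + (b[1] - b[0] + 1) - inter
    return inter / union


# Hypothetical toy series: a noisy baseline with one spike at index 20.
series = [10.0, 11.0] * 10 + [50.0] + [10.0, 11.0] * 3
scores = causal_zscores(series, window=10)
detected = to_events([s > 3.0 for s in scores])  # threshold 3.0 is an assumption
truth = (19, 21)  # hypothetical labeled anomaly window

iou = interval_iou(detected[0], truth)
alert_delay = detected[0][0] - truth[0]  # steps between event onset and first alert
```

Because each score depends only on past observations, the same loop could run unchanged on a live stream; swapping in a different detector would only replace `causal_zscores`, leaving the event-level metrics as they are.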
Keywords:
Support vector machines, Systematics, Sensitivity, Protocols, Key performance indicator, Benchmark testing, Real-time systems, Long short term memory, Anomaly detection, Monitoring, time series, infrastructure monitoring, key performance indicators, benchmarking
Published
28/10/2025
How to Cite
SCOPARO, Maynara Natalia; BRUSCHI, Sarita Mazzini. Anomaly Detection for Infrastructure Key Performance Indicators under Real-Time Constraints. In: WORKSHOP ON CLOUD COMPUTING (WCC) - INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE AND HIGH PERFORMANCE COMPUTING (SBAC-PAD), 37., 2025, Bonito/MS. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 100-107.
