An Enhanced Seasonal-Hybrid ESD Technique for Robust Anomaly Detection on Time Series
Abstract
Time series data underlies countless research activities. Despite the wide range of techniques available to capture and process this information, analyzing large volumes of data and detecting unusual behavior in them remain challenging. In this context, this paper proposes SH-ESD+, a statistical technique that combines the Extreme Studentized Deviate (ESD) test with a decomposition procedure based on Loess to detect anomalies in time series data. The proposed technique employs robust metrics to identify anomalies more accurately, even in the presence of trend and seasonal spikes. Simulation studies are carried out to evaluate the effectiveness of SH-ESD+ using the published Numenta Anomaly Benchmark (NAB) collection. Computational results show that SH-ESD+ performs consistently when compared against state-of-the-art and classic detection techniques.
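The general idea of pairing a seasonal decomposition with a robust ESD test can be illustrated with a minimal sketch. It is not the SH-ESD+ procedure itself, whose details the abstract does not give. Several choices here are assumptions made for illustration: the median and the scaled median absolute deviation (MAD) stand in for the "robust metrics", a per-phase median replaces the Loess-based decomposition, and the function names (`seasonal_component`, `robust_esd`, `detect`) and the parameters `period`, `max_anoms`, and `alpha` are hypothetical. The Student-t quantile is computed numerically so that the sketch needs only the standard library.

```python
import math
from statistics import median


def t_cdf(x, df, steps=2000):
    """Student-t CDF via Simpson's rule on the density (steps must be even)."""
    if x == 0:
        return 0.5
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(u):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(u * u / df))

    h = abs(x) / steps
    total = pdf(0.0) + pdf(abs(x))
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3
    return 0.5 + area if x > 0 else 0.5 - area


def t_ppf(p, df):
    """Invert t_cdf by bisection; assumes 0.5 < p < 1."""
    lo, hi = 0.0, 1.0
    while t_cdf(hi, df) < p:
        hi *= 2
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def seasonal_component(x, period):
    """Per-phase median of the median-centred series.

    Assumption: a crude stand-in for a Loess-based seasonal estimate.
    """
    centre = median(x)
    phase = [median(v - centre for v in x[p::period]) for p in range(period)]
    return [phase[i % period] for i in range(len(x))]


def robust_esd(residuals, max_anoms, alpha=0.05):
    """Generalized ESD test using median and MAD instead of mean and std.

    Requires max_anoms <= len(residuals) - 2. Returns sorted anomaly indices.
    """
    n = len(residuals)
    remaining = list(enumerate(residuals))
    removed, n_anoms = [], 0
    for i in range(1, max_anoms + 1):
        values = [v for _, v in remaining]
        med = median(values)
        mad = 1.4826 * median(abs(v - med) for v in values)
        if mad == 0:
            break
        idx, val = max(remaining, key=lambda iv: abs(iv[1] - med))
        stat = abs(val - med) / mad
        remaining.remove((idx, val))
        removed.append(idx)
        # Critical value of the generalized ESD test for step i.
        p = 1 - alpha / (2 * (n - i + 1))
        t = t_ppf(p, n - i - 1)
        lam = (n - i) * t / math.sqrt((n - i - 1 + t * t) * (n - i + 1))
        if stat > lam:
            n_anoms = i  # keep the largest i whose statistic exceeds lam
    return sorted(removed[:n_anoms])


def detect(x, period, max_anoms, alpha=0.05):
    """Remove the seasonal estimate and the median, then test the residuals."""
    centre = median(x)
    seasonal = seasonal_component(x, period)
    residuals = [v - s - centre for v, s in zip(x, seasonal)]
    return robust_esd(residuals, max_anoms, alpha)


# Synthetic series: period-12 sine, small deterministic jitter, one spike.
series = [10 + 3 * math.sin(2 * math.pi * i / 12) + 0.1 * ((i * 7) % 5 - 2)
          for i in range(60)]
series[30] += 8
anomalies = detect(series, period=12, max_anoms=3)
print(anomalies)
```

Using the median and MAD keeps the location and scale estimates from being inflated by the very points under test, which is the usual motivation for robust statistics in this setting. A Loess-based decomposition such as STL would replace `seasonal_component` in a fuller implementation.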