Weakly Supervised Video Anomaly Detection Combining Deep Features with Shallow Neural Networks





Anomaly detection, Video surveillance, Multiple Instance Learning, I3D features, Shallow Neural Networks


Deep features have surpassed handcrafted features in many applications. The availability of pre-trained deep feature extractors helps to overcome one of the main drawbacks of deep learning: the need for large volumes of training data. Multiple Instance Learning (MIL) has become an attractive solution in the video surveillance literature because it allows working with weakly labeled datasets. This work evaluates a video anomaly detection approach based on the MIL paradigm that combines deep features with shallow neural networks. For computational efficiency, we apply Principal Component Analysis (PCA) for dimensionality reduction before classification. We ran the experiments on a set of I3D (Inflated 3D) features extracted from the ShanghaiTech benchmark dataset, and the shallow MLP and SVM classifiers achieved competitive results.
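The pipeline outlined in the abstract (pre-extracted deep features, PCA, then a shallow classifier trained under weak video-level labels) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the features are random stand-ins for I3D vectors, the sizes (`N_VIDEOS`, `N_SEGMENTS`, `N_DIMS`), the number of principal components, and the hidden layer size are all assumptions, and the MIL handling shown here (each segment inherits its video's label, and a video is scored by its highest-scoring segment) is one common simplification chosen for the example.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for pre-extracted I3D features (assumed sizes):
# each video is a "bag" of segment-level feature vectors.
N_VIDEOS, N_SEGMENTS, N_DIMS = 40, 8, 256

# Weak supervision: one label per video (0 = normal, 1 = anomalous).
video_labels = np.array([0, 1] * (N_VIDEOS // 2))
bags = rng.normal(size=(N_VIDEOS, N_SEGMENTS, N_DIMS))
# Only the first two segments of each anomalous video are shifted,
# mimicking anomalies that occupy a small part of the video.
bags[video_labels == 1, :2, :] += 1.0

# Simplified MIL assumption: every segment inherits its video's label.
X = bags.reshape(-1, N_DIMS)
y = np.repeat(video_labels, N_SEGMENTS)

# PCA reduces dimensionality before the shallow classifier (an MLP here).
model = make_pipeline(
    StandardScaler(),
    PCA(n_components=16, random_state=0),
    MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0),
)
model.fit(X, y)

# Segment-level anomaly scores, aggregated per video with a max.
segment_scores = model.predict_proba(X)[:, 1].reshape(N_VIDEOS, N_SEGMENTS)
video_scores = segment_scores.max(axis=1)
print(video_scores.round(2))
```

Swapping `MLPClassifier` for `sklearn.svm.SVC(probability=True)` would give an SVM variant of the same sketch. A real evaluation would score held-out videos rather than the training data used here.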








How to Cite

Pereira, S. S. L., & José Everardo Bessa Maia. (2022). Weakly Supervised Video Anomaly Detection Combining Deep Features with Shallow Neural Networks. Journal of the Brazilian Computer Society, 28(1), 69–79. https://doi.org/10.5753/jbcs.2022.2194