HSAE: A Hybrid Unsupervised Autoencoder for Zero-Day Attack Detection in Realistic Environments
Abstract
Research Context: Anomaly detection in computer networks is essential for cybersecurity, especially in complex environments such as corporate and IoT networks, which face emerging threats.
Scientific and/or Practical Problem: Traditional Intrusion Detection Systems (IDS) struggle to identify zero-day attacks, leaving significant security gaps. Signature-based approaches fail against unknown threats, while supervised models require large volumes of labeled data that are rarely available in practice. Unsupervised methods often produce high false positive rates (FPR), which limits their operational viability.
Proposed Solution and/or Analysis: This work proposes HSAE (Hybrid Scoring Autoencoder), a lightweight hybrid deep learning model for network anomaly detection. It combines the autoencoder's reconstruction error with an auxiliary output trained on pseudo-labels, so the model learns normality patterns exclusively from benign traffic.
Related IS Theory: The research is grounded in machine-learning-based anomaly detection theory, specifically unsupervised approaches. The model builds on the Autoencoder (AE) architecture, using its principles of data compression and reconstruction to identify behavioral deviations.
Research Method: The study is a quantitative, empirical evaluation in which the model was trained and tested on the public datasets CICIDS2017 and ToN_IoT. HSAE was compared with a Variational Autoencoder (VAE), taken as the state-of-the-art baseline, using AUC, precision, recall, F1-score, and FPR.
Summary of Results: HSAE consistently outperformed the VAE, achieving an AUC of 0.94 in the corporate scenario and 0.86 in IoT ransomware detection. The results indicate that the model is a practical tool for high-precision anomaly identification, with significantly lower false positive rates than the VAE baseline.
Contributions and Impact to IS area: This work contributes to Information Systems by presenting an anomaly detection model tailored for realistic scenarios, offering practical approaches to managing zero-day threats. Its computational efficiency enables implementation in resource-constrained environments like IoT and industrial systems where security is critical.
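The hybrid scoring idea summarized above (reconstruction error combined with an auxiliary output) can be illustrated with a minimal sketch. The abstract does not state how the two signals are fused, so the weighted sum, the min-max normalization, the weight `alpha`, the 0.5 threshold, and all function names below are assumptions made for illustration only and are not the authors' implementation. The reconstructions `x_hat` and the auxiliary probabilities `aux_prob` are hand-written stand-ins for the outputs of a trained model.

```python
import numpy as np

def reconstruction_error(x, x_hat):
    """Per-sample mean squared error between inputs and their reconstructions."""
    return np.mean((x - x_hat) ** 2, axis=1)

def hybrid_score(x, x_hat, aux_prob, err_min, err_max, alpha=0.5):
    """Hypothetical fusion rule: weighted sum of normalized error and auxiliary output.

    err_min and err_max would typically be estimated on benign validation traffic.
    """
    err = reconstruction_error(x, x_hat)
    # Min-max normalize the error to [0, 1] so both terms share a scale.
    err_norm = np.clip((err - err_min) / (err_max - err_min + 1e-12), 0.0, 1.0)
    return alpha * err_norm + (1.0 - alpha) * aux_prob

def false_positive_rate(y_true, y_pred):
    """FPR = FP / (FP + TN), where label 0 is benign and 1 is anomalous."""
    fp = np.sum((y_true == 0) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    return float(fp / (fp + tn)) if (fp + tn) > 0 else 0.0

# Toy data: one benign-looking flow and one anomalous-looking flow.
x = np.array([[0.1, 0.2], [0.9, 0.8]])
x_hat = np.array([[0.1, 0.2], [0.1, 0.2]])   # stand-in for autoencoder output
aux_prob = np.array([0.1, 0.9])              # stand-in for auxiliary head output
y_true = np.array([0, 1])

scores = hybrid_score(x, x_hat, aux_prob, err_min=0.0, err_max=0.5)
y_pred = (scores >= 0.5).astype(int)         # assumed decision threshold
fpr = false_positive_rate(y_true, y_pred)
```

In this sketch, a flow is flagged only when the combined score crosses the threshold, so a high reconstruction error alone does not have to trigger an alert. That is one plausible way a second signal could reduce false positives, though the actual mechanism used by HSAE may differ.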
