Comparative Study of Different Feature Sets for Intrusion Detection in Computer Networks

  • André Oliveira de Alcântara UFU
  • Rodrigo S. Miani UFU
  • Elaine Ribeiro de Faria UFU

Abstract


Intrusion detection in computer networks is fundamental to ensuring information security in digital systems. However, the effectiveness of intrusion detection systems depends on data quality and on the selection of attributes used to train classification models. This study examines how different attribute sets affect intrusion detection in data streams. We conducted an experimental analysis on the CICIDS2017 dataset, comparing several attribute sets from the existing literature with a set that we propose. We also considered different delays in obtaining the true labels. Our findings suggest that, although delayed labels affect classifier performance, the choice of attribute set used in training also plays a significant role.
Keywords: intrusion detection, attribute set, data streams
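
The kind of comparison summarized in the abstract can be illustrated with a minimal sketch, assuming a test-then-train (prequential) protocol in which each true label arrives only after a fixed number of later instances. Everything here is hypothetical and for illustration only: the `OnlineCentroid` learner, the synthetic two-attribute stream, the attribute subsets, and the delay values are assumptions, not the authors' classifiers, attribute sets, or CICIDS2017 data.

```python
from collections import deque
import random


class OnlineCentroid:
    """Tiny incremental nearest-centroid classifier (illustrative stand-in)."""

    def __init__(self):
        self.sums = {}    # label -> per-attribute running sums
        self.counts = {}  # label -> number of labeled examples seen

    def learn_one(self, x, y):
        if y not in self.sums:
            self.sums[y] = [0.0] * len(x)
            self.counts[y] = 0
        self.sums[y] = [s + v for s, v in zip(self.sums[y], x)]
        self.counts[y] += 1

    def predict_one(self, x):
        if not self.counts:
            return 0  # assumption: predict "benign" (0) until a label arrives

        def dist(label):
            n = self.counts[label]
            return sum((v - s / n) ** 2 for v, s in zip(x, self.sums[label]))

        return min(self.counts, key=dist)


def prequential_accuracy(stream, feature_idx, delay):
    """Test-then-train; an instance's label is used only `delay` instances later."""
    model = OnlineCentroid()
    pending = deque()  # instances whose labels have not been released yet
    correct = total = 0
    for x_full, y in stream:
        x = [x_full[i] for i in feature_idx]  # keep only the chosen attributes
        correct += int(model.predict_one(x) == y)
        total += 1
        pending.append((x, y))
        if len(pending) > delay:
            model.learn_one(*pending.popleft())
    return correct / total


def make_stream(n=2000, seed=0):
    """Synthetic stream: attribute 0 separates the classes, attribute 1 is noise."""
    rng = random.Random(seed)
    stream = []
    for _ in range(n):
        y = int(rng.random() < 0.3)  # 1 = attack, 0 = benign
        stream.append(([rng.gauss(3.0 * y, 1.0), rng.gauss(0.0, 1.0)], y))
    return stream


stream = make_stream()
attribute_sets = {"informative": [0], "noise_only": [1]}
results = {
    (name, delay): prequential_accuracy(stream, idx, delay)
    for name, idx in attribute_sets.items()
    for delay in (0, 100)
}
for key, acc in sorted(results.items()):
    print(key, round(acc, 3))
```

Holding the released labels in a queue keeps the evaluation honest: the model never trains on a label before the assumed delay has elapsed, so differences between rows of `results` reflect only the attribute subset and the delay.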

Published
2024-11-17
ALCÂNTARA, André Oliveira de; MIANI, Rodrigo S.; FARIA, Elaine Ribeiro de. Comparative Study of Different Feature Sets for Intrusion Detection in Computer Networks. In: NATIONAL MEETING ON ARTIFICIAL AND COMPUTATIONAL INTELLIGENCE (ENIAC), 21., 2024, Belém/PA. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2024. p. 239-250. ISSN 2763-9061. DOI: https://doi.org/10.5753/eniac.2024.245047.
