Impact of Dimensionality Reduction and Feature Selection on the Generalization of Intrusion Detection Models

Kelson Carvalho Santos; Rodrigo Sanches Miani

doi:10.5753/sbrc.2025.6353

Kelson Carvalho Santos UFU / IFPI http://orcid.org/0000-0002-3644-5867
Rodrigo Sanches Miani UFU https://orcid.org/0000-0002-8176-8040

Abstract

Previous work offers few initiatives aimed at improving the generalization of intrusion detection models across distinct network traffic scenarios. Among the solutions explored for this problem, Dimensionality Reduction is the technique most frequently used. However, comparative analyses between this technique and similar approaches remain scarce. This work proposes the implementation and comparative analysis of Dimensionality Reduction and Feature Selection, with the aim of evaluating their impact as strategies for improving the generalization of intrusion detection models based on machine learning. The results indicate that both can help address this challenge.

Keywords: Machine Learning, Intrusion Detection, Model Generalization, Dimensionality Reduction, Feature Selection
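
The kind of comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the synthetic data, the random forest classifier, the choice of PCA and an ANOVA F-score filter, the value `K = 10`, and the way the second "traffic scenario" is simulated are all assumptions made for illustration. The idea is to fit each variant on one scenario and score it on a different one that the model never saw.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.metrics import f1_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for flow features (assumption: NOT the paper's datasets).
# With shuffle=False, columns 0-11 carry signal and columns 12-29 are noise.
X, y = make_classification(
    n_samples=2000, n_features=30, n_informative=8, n_redundant=4,
    shuffle=False, random_state=0,
)

# Shuffle rows so both scenarios contain both classes.
rng = np.random.default_rng(0)
order = rng.permutation(len(y))
X, y = X[order], y[order]

# "Source" scenario used for training, "target" scenario used only for testing.
X_src, y_src = X[:1000], y[:1000]
X_tgt, y_tgt = X[1000:].copy(), y[1000:]

# Simulate a different network: the noise columns change scale and offset.
X_tgt[:, 12:] = 3.0 * X_tgt[:, 12:] + 2.0

K = 10  # hypothetical number of components / selected features


def make_classifier():
    # Fresh classifier per pipeline so the variants do not share state.
    return RandomForestClassifier(n_estimators=100, random_state=0)


pipelines = {
    "baseline": make_pipeline(StandardScaler(), make_classifier()),
    "dimensionality_reduction": make_pipeline(
        StandardScaler(), PCA(n_components=K, random_state=0), make_classifier()
    ),
    "feature_selection": make_pipeline(
        StandardScaler(), SelectKBest(f_classif, k=K), make_classifier()
    ),
}

# Fit on the source scenario only, then evaluate on the unseen target scenario.
results = {}
for name, pipe in pipelines.items():
    pipe.fit(X_src, y_src)
    results[name] = f1_score(y_tgt, pipe.predict(X_tgt))

for name, score in results.items():
    print(f"{name}: F1 on unseen scenario = {score:.3f}")
```

Wrapping the scaler and the reduction or selection step in a single pipeline means they are fitted on the source scenario only, so no statistics from the target scenario leak into training. The scores on this toy data say nothing about the paper's findings; they only show how such a cross-scenario evaluation can be wired up.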
