Enhancing APT Detection with Synthetic Data Generation Based on GAN-Transformers

Alfredo Cossetin Neto; Rafel C. Pregardier; Carlos R. P. dos Santos; Vinicius Fulber-Garcia; Luis A. L. Silva

doi:10.5753/sbseg.2025.11378

Alfredo Cossetin Neto (UFSM)
Rafel C. Pregardier (UFSM)
Carlos R. P. dos Santos (UFSM)
Vinicius Fulber-Garcia (UFPR)
Luis A. L. Silva (UFSM)

Abstract

This work investigates the generation of synthetic Advanced Persistent Threat (APT) data using Generative Adversarial Networks (GANs) adapted to the time-series domain. Because APTs are stealthy and sequential, traditional data generation approaches that ignore temporal dynamics are insufficient. To overcome this limitation, this study explores the Transformer Time-Series Conditional GAN (TTS-CGAN) architecture, originally proposed for biosignals, and proposes specific adaptations for generating malicious network flows. The process includes modeling data from the DAPT2020 dataset, architectural adjustments to increase capacity and diversity, and validation of the synthetic data through qualitative and quantitative metrics and through the performance of Machine Learning (ML) models trained on real, synthetic, and semi-synthetic sets. The results indicate that synthetic data generated by TTS-CGAN can improve APT detection, demonstrating the feasibility and benefits of the proposed approach.
