Hyperparameter optimization of DroidAugmentor for synthetic Android malware data generation
Abstract
In this work, we present a comprehensive study of the impact of hyperparameter optimization, such as the dropout rate and the number of hidden layers, on the DroidAugmentor tool, focusing on the augmentation of 10 distinct Android malware datasets. We evaluate the effects of this optimization on utility and similarity metrics and on the consumption of computational resources such as CPU and memory. The results confirm that precise hyperparameter tuning is essential to maximize the quality of the generated data while also improving the efficiency of computational resource usage.
Keywords:
Android malware, Synthetic data generation, Hyperparameter optimization, cGAN, Data Augmentation
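The abstract describes tuning the dropout rate and the number of hidden layers of DroidAugmentor's cGAN. As a minimal sketch of how such a search space can be enumerated for a grid search (the value ranges below are hypothetical, not the ones used in the study):

```python
from itertools import product

# Hypothetical hyperparameter ranges (illustrative only; the actual
# values explored by the study are not reproduced here).
dropout_rates = [0.1, 0.3, 0.5]
hidden_layers = [1, 2, 3, 4]

def build_grid(rates, layers):
    """Return every (dropout, n_layers) combination as a list of dicts."""
    return [{"dropout": d, "n_layers": n} for d, n in product(rates, layers)]

grid = build_grid(dropout_rates, hidden_layers)
print(len(grid))  # prints 12: 3 dropout rates x 4 layer counts
```

Each configuration in the grid would then be used to train one cGAN instance, with the resulting synthetic data scored on utility and similarity metrics and the training run profiled for CPU and memory usage.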
References
Barcellos, L. V. ADBuilder: uma ferramenta de construção de datasets para detecção de malwares Android. Universidade Federal do Pampa, 2023.
Rocha, V. et al. AMGenerator e AMExplorer: geração de metadados e construção de datasets Android. In: Anais Estendidos do XXIII Simpósio Brasileiro em Segurança da Informação e de Sistemas Computacionais. SBC, 2023. p. 41–48.
Chlap, P. et al. A review of medical image data augmentation techniques for deep learning applications. Journal of Medical Imaging and Radiation Oncology, Wiley Online Library, v. 65, n. 5, p. 545–563, 2021.
Xu, L. et al. Modeling Tabular Data Using Conditional GAN. Advances in Neural Information Processing Systems, v. 32, 2019.
Rajabi, A.; Garibay, O. O. TabFairGAN: Fair Tabular Data Generation with Generative Adversarial Networks. Machine Learning and Knowledge Extraction, MDPI, v. 4, n. 2, p. 488, 2022.
Casola, K. et al. DroidAugmentor: uma ferramenta de treinamento e avaliação de cGANs para geração de dados sintéticos. In: Anais Estendidos do XXIII Simpósio Brasileiro em Segurança da Informação e de Sistemas Computacionais. SBC, 2023. p. 57–64.
Weerts, H. J.; Mueller, A. C.; Vanschoren, J. Importance of tuning hyperparameters of machine learning algorithms. arXiv preprint arXiv:2007.07588, 2020.
Goodfellow, I. et al. Generative adversarial nets. Advances in Neural Information Processing Systems, v. 27, 2014.
Kurach, K. et al. A large-scale study on regularization and normalization in GANs. In: International Conference on Machine Learning. PMLR, 2019. p. 3581–3590.
Sabiri, B.; El Asri, B.; Rhanoui, M. Effect of Convolution Layers and Hyper-parameters on the Behavior of Adversarial Neural Networks. In: International Conference on Enterprise Information Systems. Springer, 2022. p. 222–245.
Tam, G.; Hunter, A. Machine learning to identify Android malware. In: 2018 9th IEEE Annual Ubiquitous Computing, Electronics & Mobile Communication Conference (UEMCON). IEEE, 2018. p. 1–5.
Alarsan, F. I.; Younes, M. Best selection of generative adversarial networks hyper-parameters using genetic algorithm. SN Computer Science, Springer, v. 2, n. 4, p. 283, 2021.
Minarno, A. E. et al. Convolutional neural network with hyperparameter tuning for brain tumor classification. Kinetik: Game Technology, Information System, Computer Network, Computing, Electronics, and Control, 2021.
Saranyaraj, D.; Manikandan, M.; Maheswari, S. A deep convolutional neural network for the early detection of breast carcinoma with respect to hyper-parameter tuning. Multimedia Tools and Applications, Springer, v. 79, n. 15-16, p. 11013–11038, 2020.
Van Den Hoogen, J. et al. Hyperparameter analysis of wide-kernel CNN architectures in industrial fault detection: an exploratory study. International Journal of Data Science and Analytics, Springer, p. 1–22, 2023.
Antunes, A. et al. Hyperparameter optimization of a convolutional neural network model for pipe burst location in water distribution networks. Journal of Imaging, MDPI, v. 9, n. 3, p. 68, 2023.
Li, D. et al. Evaluating the energy efficiency of deep convolutional neural networks on CPUs and GPUs. In: 2016 IEEE International Conferences on Big Data and Cloud Computing (BDCloud), Social Computing and Networking (SocialCom), Sustainable Computing and Communications (SustainCom) (BDCloud-SocialCom-SustainCom). IEEE, 2016. p. 477–484.
Hutter, F.; Lücke, J.; Schmidt-Thieme, L. Beyond manual tuning of hyperparameters. KI-Künstliche Intelligenz, Springer, v. 29, p. 329–337, 2015.
Putatunda, S.; Rama, K. A modified Bayesian optimization based hyper-parameter tuning approach for extreme gradient boosting. In: 2019 Fifteenth International Conference on Information Processing (ICINPRO). IEEE, 2019. p. 1–6.
Kiran, M.; Ozyildirim, M. Hyperparameter tuning for deep reinforcement learning applications. arXiv preprint arXiv:2201.11182, 2022.
Dhake, H.; Kashyap, Y.; Kosmopoulos, P. Algorithms for hyperparameter tuning of LSTMs for time series forecasting. Remote Sensing, MDPI, v. 15, n. 8, p. 2076, 2023.
Mantovani, R. G. et al. A meta-learning recommender system for hyperparameter tuning: Predicting when tuning improves SVM classifiers. Information Sciences, Elsevier, v. 501, p. 193–221, 2019.
Karamchandani, A. et al. A methodological framework for optimizing the energy consumption of deep neural networks: a case study of a cyber threat detector. Neural Computing and Applications, Springer, v. 36, n. 17, p. 10297–10338, 2024.
Seybold, C. et al. Dropout-GAN: Learning from a Dynamic Ensemble of Discriminators. arXiv preprint arXiv:1807.11346, 2018.
Published
27/11/2024
How to Cite
NOGUEIRA, Angelo Gaspar Diniz; OLIVEIRA, Lucas Ferreira Areais de; SILVA, Anna Luiza Gomes da; KREUTZ, Diego; MANSILHA, Rodrigo Brandão. Otimização de hiperparâmetros da DroidAugmentor para geração de dados sintéticos de malware Android. In: ESCOLA REGIONAL DE REDES DE COMPUTADORES (ERRC), 21., 2024, Rio Grande/RS. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2024. p. 141-147. DOI: https://doi.org/10.5753/errc.2024.4676.