Bifocal Agent: automatically identifying malicious functions to sharpen the malware analyst's focus
Abstract
Although several solutions exist for the automatic detection of malicious components, malware analysis is still a predominantly manual process whose bottleneck is the human analyst. Recent work has been able to identify suspicious regions in code, reducing the analyst's effort. However, such solutions are either signature-based or generate many false positives. To overcome this challenge, we propose the Bifocal Agent, which operates at two distinct levels of granularity (function and basic block) and uses new features to improve the detection of malicious functions. In experiments, the solution improved the area under the ROC curve by 17% over the state of the art and reduced false positives by more than a third.
References
Alrawi, O., Ike, M., Pruett, M., Kasturi, R. P., Barua, S., Hirani, T., Hill, B., and Saltaformaggio, B. (2021). Forecasting malware capabilities from cyber attack memory images. In 30th USENIX Security Symposium (USENIX Security 21), pages 3523–3540. USENIX Association.
Andriesse, D., Slowinska, A., and Bos, H. (2017). Compiler-agnostic function detection in binaries. In 2017 IEEE European Symposium on Security and Privacy (EuroS&P), pages 177–189.
Coscia, A., Dentamaro, V., Galantucci, S., Maci, A., and Pirlo, G. (2023). Yamme: a yara-byte-signatures metamorphic mutation engine. IEEE Transactions on Information Forensics and Security, 18:4530–4545.
David, O. E. and Netanyahu, N. S. (2015). Deepsign: Deep learning for automatic malware signature generation and classification. In 2015 International Joint Conference on Neural Networks (IJCNN), pages 1–8.
Downing, E., Mirsky, Y., Park, K., and Lee, W. (2021). DeepReflect: Discovering malicious functionality through binary reconstruction. In 30th USENIX Security Symposium (USENIX Security 21), pages 3469–3486. USENIX Association.
Gutman, Y. (2019). Stop the churn, avoid burnout: How to keep your cybersecurity personnel. [link]. Accessed: 2024-09-30.
Jones, L., Sellers, A., and Carlisle, M. (2016). Cardinal: similarity analysis to defeat malware compiler variations. In 2016 11th International Conference on Malicious and Unwanted Software (MALWARE), pages 1–8.
Kaspersky (2023). Kaspersky Security Bulletin 2022. Statistics - securelist.com. [link]. Accessed: 2024-05-20.
Lester, M. (2021). Pe malware machine learning dataset. [link]. Accessed: 2024-09-30.
Li, S., Ming, J., Qiu, P., Chen, Q., Liu, L., Bao, H., Wang, Q., and Jia, C. (2023). PackGenome: Automatically generating robust yara rules for accurate malware packer detection. In Proceedings of the 2023 ACM SIGSAC Conference on Computer and Communications Security, CCS ’23, pages 3078–3092, New York, NY, USA. Association for Computing Machinery.
Molloy, C., Charland, P., Ding, S. H. H., and Fung, B. C. M. (2022). Jarv1s: Phenotype clone search for rapid zero-day malware triage and functional decomposition for cyber threat intelligence. In 2022 14th International Conference on Cyber Conflict: Keep Moving! (CyCon), volume 700, pages 385–403.
Novkovic, I. and Groš, S. (2016). Can malware analysts be assisted in their work using techniques from machine learning? In 2016 39th International Convention on Information and Communication Technology, Electronics and Microelectronics (MIPRO), pages 1408–1413.
Raff, E., Barker, J., Sylvester, J., Brandon, R., Catanzaro, B., and Nicholas, C. (2017). Malware detection by eating a whole exe.
Royal, P., Halpin, M., Dagon, D., Edmonds, R., and Lee, W. (2006). Polyunpack: Automating the hidden-code extraction of unpack-executing malware. In 2006 22nd Annual Computer Security Applications Conference (ACSAC’06), pages 289–300.
Ruaro, N., Pagani, F., Ortolani, S., Kruegel, C., and Vigna, G. (2022). Symbexcel: Automated analysis and understanding of malicious excel 4.0 macros. In 2022 IEEE Symposium on Security and Privacy (SP), pages 1066–1081.
Yong Wong, M., Landen, M., Antonakakis, M., Blough, D. M., Redmiles, E. M., and Ahamad, M. (2021). An inside look into the practice of malware analysis. CCS ’21, pages 3053–3069, New York, NY, USA. Association for Computing Machinery.
Zhong, Y., Yamaki, H., Yamaguchi, Y., and Takakura, H. (2013). Ariguma code analyzer: Efficient variant detection by identifying common instruction sequences in malware families. In 2013 IEEE 37th Annual Computer Software and Applications Conference, pages 11–20.
Published
2024-09-16
How to Cite
CHAHUD, Leonardo Gonçalves; ROCHA, Rafael Oliveira da; PEREIRA JR., Lourenço Alves; DRAGO, Idilio. Bifocal Agent: identificando automaticamente funções maliciosas para aumentar o foco do analista de malware. In: SIMPÓSIO BRASILEIRO DE SEGURANÇA DA INFORMAÇÃO E DE SISTEMAS COMPUTACIONAIS (SBSEG), 24., 2024, São José dos Campos/SP. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2024. p. 60-75. DOI: https://doi.org/10.5753/sbseg.2024.241689.