Metamorphic malware detection based on indexing of data dependency graphs
Abstract
Metamorphism and code mutation have been used successfully by malware writers to generate obfuscated code without altering its original features, making it more difficult to detect. This work presents an approach for identifying metamorphic malware by extracting features from Data Dependency Graphs to construct a classification index that can quickly and accurately recognize whether a suspicious code sample belongs to a malware family. Experimental results on 3045 metamorphic virus samples show higher average accuracy rates than most commercial antiviruses.
Published
2017-11-06
How to Cite
AGUILERA, Luis Rojas; SOUTO, Eduardo; MARTINS, Gilbert Breves. Detecção de malware metamórfico baseada na indexação de grafos de dependência de dados. In: BRAZILIAN SYMPOSIUM ON CYBERSECURITY (SBSEG), 17., 2017, Brasília. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2017. p. 264-277. DOI: https://doi.org/10.5753/sbseg.2017.19505.
