Traceability of Executable Code Using Neural Networks (Rastreabilidade de Códigos Executáveis usando Redes Neurais)
Abstract
Code traceability refers to the mapping between equivalent codes written in different languages, including high-level and low-level programming languages. In the field of Legal Metrology, it is critical to guarantee that the binary code embedded in a meter corresponds to program source code previously approved by the Legal Metrology Authority. In this paper, we propose a novel approach for correlating source and binary codes using artificial neural networks: the network is fed with logical flow characteristics extracted from both codes. Any false positive is a critical issue for software evaluation purposes. Our evaluation using real code examples shows a typical correspondence rate between 62% and 90% for the traceability of binary codes, with a false-positive rate of 4%.
References
Angulo, C., Ruiz, F., González, L., and Ortega, J. A. (2006). Multi-classification by using tri-class svm. Neural Processing Letters, 23(1):89–101.
Asadi, R., Mustapha, N., and Sulaiman, N. (2009). New supervisioned multi layer feed forward neural network model to accelerate classification with high accuracy. European Journal of Scientific Research, 33(1):163–178.
Boccardo, D. R., Lakhotia, A., Manacero Jr, A., and Venable, M. (2009). Adapting call-string approach for x86 obfuscated binaries. In Simpósio Brasileiro em Segurança da Informação e de Sistemas Computacionais.
Burkardt, J. (2010). C software. http://people.sc.fsu.edu/burkardt/. (Last accessed June 2010).
Buttle, D. L. (2001). Verification of Compiled Code. PhD thesis, University of York, UK.
Ciocoiu, I. B. (2002). Hybrid feedforward neural networks for solving classification problems. Neural Processing Letters, 16(1):81–91.
Flake, H. (2004). Structural comparison of executable objects. In Proc. of the Conference on Detection of Intrusions and Malware & Vulnerability Assessment (DIMVA). IEEE Computer Society.
Hassan, A. E., Jiang, Z. M., and Holt, R. C. (2005). Source versus object code extraction for recovering software architecture. In WCRE ’05: Proceedings of the 12th Working Conference on Reverse Engineering, pages 67–76, Washington, DC, USA. IEEE Computer Society.
Hatton, L. (2005). Estimating source lines of code from object code. In Windows and Embedded Control Systems.
Haykin, S. (1998). Neural Networks: A Comprehensive Foundation. Prentice Hall.
Hertz, J. A., Krogh, A. S., and Palmer, R. G. (1991). Introduction to the Theory of Neural Computation. Addison-Wesley, Redwood City, CA, USA.
IdaPro (2010). IDA Pro disassembler. http://www.hex-rays.com/idapro/. (Last accessed June 2010).
Lenic, M., Povalej, P., Kokol, P., and Cardoso, A. I. (2004). Using cellular automata to predict reliability of modules. In Proceeding (436) Software Engineering and Applications.
McDonald, J. (2010). Delphi falls prey. http://www.symantec.com/connect/blogs/delphifalls-prey. (Last accessed June 2010).
Men, H., Wu, Y., Gao, Y., Kou, Z., Xu, Z., and Yang, S. (2008). Application of support vector machine to heterotrophic bacteria colony recognition. In CSSE (1), pages 830–833.
Moler, C. B. (1980). MATLAB — an interactive matrix laboratory. Technical Report 369, University of New Mexico. Dept. of Computer Science.
Moretti, E., Chanteperdrix, G., and Osorio, A. (2001). New algorithms for control-flow graph structuring. In CSMR ’01: Proceedings of the Fifth European Conference on Software Maintenance and Reengineering, page 184, Washington, DC, USA. IEEE Computer Society.
Oh, J. (2009). Fight against 1-day exploits: Diffing binaries vs anti-diffing binaries. In Blackhat technical Security Conference.
Oliveira Cruz, A. J. (2010). C software. http://equipe.nce.ufrj.br/adriano/c/exemplos.htm. (Last accessed June 2010).
Poznyakoff, S. (2010). GNU cflow. http://savannah.gnu.org/projects/cflow. (Last accessed June 2010).
Quinlan, D. and Panas, T. (2009). Source code and binary analysis of software defects. In CSIIRW ’09: Proceedings of the 5th Annual Workshop on Cyber Security and Information Intelligence Research, pages 1–4, New York, NY, USA. ACM.
Reddy, C. S., Raju, K. V. S. V. N., Kumari, V. V., and Devi, G. L. (2007). Fault-prone module prediction of a web application using artificial neural networks. In Proceeding (591) Software Engineering and Applications.
Thompson, K. (1984). Reflections on trusting trust. Commun. ACM, 27(8):761–763.
Wang, Z., Pierce, K., and McFarling, S. (2002). BMAT: a binary matching tool for stale profile propagation. In The Journal of Instruction-Level Parallelism.
Zeng, H. and Rine, D. (2004). A neural network approach for software defects fix effort estimation. In IASTED Conf. on Software Engineering and Applications, pages 513–517.
Zhenga, J. (2007). Predicting software reliability with neural network ensembles. Expert Systems with Applications, (36):2116–2122.
Zhenga, J. (2009). A digital image encryption algorithm based on hyper-chaotic cellular neural network. Journal Fundamenta Informaticae.
Published
2010-10-11
How to Cite
NASCIMENTO, Tiago M.; CARMO, Luiz F. R. C.; BOCCARDO, Davidson R.; MACHADO, Raphael C.; PRADO, Charles B. Rastreabilidade de Códigos Executáveis usando Redes Neurais. In: BRAZILIAN SYMPOSIUM ON CYBERSECURITY (SBSEG), 10., 2010, Fortaleza. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2010. p. 397-310.
DOI: https://doi.org/10.5753/sbseg.2010.20595.
