Android Malware Detection: Datasets and Reproducibility

  • Taina Soares UNIPAMPA
  • Guilherme Siqueira UNIPAMPA
  • Lucas Barcellos UNIPAMPA
  • Renato Sayyed UNIPAMPA
  • Luciano Vargas UNIPAMPA
  • Gustavo Rodrigues UNIPAMPA
  • Joner Assolin UNIPAMPA
  • Jonas Pontes UFAM
  • Eduardo Feitosa UFAM
  • Diego Kreutz UNIPAMPA


In this work, we evaluate an initial sample of 38 research papers that use machine learning for Android malware detection. We analyze, in particular, how thoroughly the datasets are described and whether they are available, since both are crucial for validating and reproducing the work. Our results suggest that none of the 38 papers is reproducible, owing to missing information and/or lack of access to the original research data.


SOARES, Taina et al. Detecção de Malwares Android: datasets e reprodutibilidade. In: ESCOLA REGIONAL DE REDES DE COMPUTADORES (ERRC), 19., 2021, Charqueadas/RS. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2021. p. 43-48. DOI: