Capturing the Behavior of Android Malware with MH-100K: A Novel and Multidimensional Dataset

Hendrio Bragança; Vanderson Rocha; Lucas Vilanova Barcellos; Eduardo Souto; Diego Kreutz; Eduardo Feitosa

doi:10.5753/sbseg.2023.233596

Hendrio Bragança (UFAM)
Vanderson Rocha (UFAM)
Lucas Vilanova Barcellos (UNIPAMPA)
Eduardo Souto (UFAM)
Diego Kreutz (UNIPAMPA)
Eduardo Feitosa (UFAM)


Abstract

The fast-paced proliferation of Android malware continues to challenge cybersecurity research. To support future malware research, we introduce MH-100K, a dataset that provides a holistic view of Android malware through 101,975 APK samples, thousands of diverse features, and accompanying metadata. We use the VirusTotal API for threat evaluation, combining multiple detection methods to improve precision. Our findings suggest that MH-100K is a valuable resource for gaining new insights into the evolution of the malware landscape.

