Capturing the Behavior of Android Malware with MH-100K: A Novel and Multidimensional Dataset
Abstract
The rapid proliferation of Android malware continues to pose challenges to cybersecurity research. To support future malware research, we introduce MH-100K, a dataset that provides a holistic view of Android malware, comprising 101,975 APK samples, thousands of diverse features, and metadata. We use the VirusTotal API for threat evaluation, combining multiple detection methods to improve precision. Our findings suggest that MH-100K is a valuable resource for gaining new insights into the evolution of the malware landscape.
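As an illustration of how verdicts from multiple detection engines might be combined into a single label, the following is a minimal sketch in Python. It assumes a dictionary of per-verdict engine counts shaped like the `last_analysis_stats` field of a VirusTotal API v3 file report. The function name `label_sample`, the threshold value, and the three-way labeling are hypothetical choices made for this example; they are not the labeling rule used to build MH-100K.

```python
from typing import Mapping

# Hypothetical cutoff: how many engines must flag a sample before it is
# treated as malware. This value is an assumption for illustration only.
MALICIOUS_THRESHOLD = 4


def label_sample(stats: Mapping[str, int],
                 threshold: int = MALICIOUS_THRESHOLD) -> str:
    """Derive a coarse label from per-verdict engine counts.

    `stats` is assumed to map verdict names (e.g. "malicious",
    "undetected") to the number of engines that returned that verdict.
    """
    flagged = stats.get("malicious", 0)
    if flagged >= threshold:
        return "malware"
    if flagged == 0:
        return "benign"
    # Some engines flagged the sample, but fewer than the threshold.
    return "uncertain"
```

A call such as `label_sample({"malicious": 12, "undetected": 50})` would fall into the "malware" branch under this assumed threshold. Keeping the raw counts alongside the derived label lets users of a dataset apply a stricter or looser cutoff later.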