HarpIA: a tool for comparative analysis of embedded AI models on Android
Abstract
This work addresses the fragmentation of the Android ecosystem, which makes benchmarking machine learning frameworks slow and dependent on frequent recompilation. To mitigate this limitation, we present HarpIA, a benchmarking tool whose dynamic architecture enables the evaluation of TensorFlow, PyTorch, and MindSpore models without recompiling the APK for each test. The study focused on validating the tool: its reliability was verified by comparing inference time and energy consumption metrics, collected during tests with the ImageNet-V2 dataset on models converted via ONNX, against performance references from the literature. As a result, HarpIA supports developers in performing performance comparisons more efficiently, aiding in the selection of frameworks for the Android environment.
References
Abadi, M., Barham, P., Chen, J., Chen, Z., Davis, A., Dean, J., Devin, M., Ghemawat, S., Irving, G., Isard, M., Kudlur, M., Levenberg, J., Monga, R., Moore, S., Murray, D. G., Steiner, B., Tucker, P., Vasudevan, V., Warden, P., Wicke, M., Yu, Y., and Zheng, X. (2016). Tensorflow: A system for large-scale machine learning.
Almeida, M., Laskaridis, S., Mehrotra, A., Dudziak, L., Leontiadis, I., and Lane, N. D. (2021). Smart at what cost? characterising mobile deep neural networks in the wild.
Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., and Fei-Fei, L. (2009). Imagenet: A large-scale hierarchical image database. In 2009 IEEE Conference on Computer Vision and Pattern Recognition, pages 248–255. IEEE.
Howard, A., Sandler, M., Chu, G., Chen, L.-C., Chen, B., Tan, M., Wang, W., Zhu, Y., Pang, R., Vasudevan, V., Le, Q. V., and Adam, H. (2019). Searching for mobilenetv3.
Hu, H., Huang, Y., Chen, Q., Zhuo, T. Y., and Chen, C. (2023). A first look at on-device models in ios apps. ACM Trans. Softw. Eng. Methodol., 33(1).
Huawei (2022). Huawei MindSpore AI Development Framework, pages 137–162. Springer Nature Singapore.
Iandola, F. N., Han, S., Moskewicz, M. W., Ashraf, K., Dally, W. J., and Keutzer, K. (2016). Squeezenet: Alexnet-level accuracy with 50x fewer parameters and <0.5MB model size.
Ignatov, A., Timofte, R., Chou, W., Wang, K., Wu, M., Hartley, T., and Van Gool, L. (2019). AI Benchmark: Running Deep Neural Networks on Android Smartphones, pages 288–314. Springer International Publishing.
Luo, C., He, X., Zhan, J., Wang, L., Gao, W., and Dai, J. (2020). Comparison and benchmarking of ai models and frameworks on mobile devices.
Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., Desmaison, A., Köpf, A., Yang, E., DeVito, Z., Raison, M., Tejani, A., Chilamkurthy, S., Steiner, B., Fang, L., Bai, J., and Chintala, S. (2019). Pytorch: An imperative style, high-performance deep learning library.
Reddi, V. J., Kanter, D., Mattson, P., Duke, J., Nguyen, T., Chukka, R., Shiring, K., Tan, K.-S., Charlebois, M., Chou, W., El-Khamy, M., Hong, J., John, T. S., Trinh, C., Buch, M., Mazumder, M., Markovic, R., Atta, T., Cakir, F., Charkhabi, M., Chen, X., Chiang, C.-M., Dexter, D., Heo, T., Schmuelling, G., Shabani, M., and Zika, D. (2020). Mlperf mobile inference benchmark.
Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S. E., Anguelov, D., Erhan, D., Vanhoucke, V., and Rabinovich, A. (2014). Going deeper with convolutions. CoRR, abs/1409.4842.
Tan, M., Chen, B., Pang, R., Vasudevan, V., and Le, Q. V. (2018). Mnasnet: Platform-aware neural architecture search for mobile. CoRR, abs/1807.11626.
Xu, M., Liu, J., Liu, Y., Lin, F. X., Liu, Y., and Liu, X. (2019). A first look at deep learning apps on smartphones. In The World Wide Web Conference, WWW ’19. ACM.
Zhang, Q., Li, X., Che, X., Ma, X., Zhou, A., Xu, M., Wang, S., Ma, Y., and Liu, X. (2022). A comprehensive benchmark of deep learning libraries on mobile devices. In Proceedings of the ACM Web Conference 2022, WWW ’22. ACM.
Published
2025-09-29
How to Cite
MIRANDA FILHO, Ricardo; MATIAS, Pedro; FREITAS, Rosiane de. HarpIA: a tool for comparative analysis of embedded AI models on Android. In: NATIONAL MEETING ON ARTIFICIAL AND COMPUTATIONAL INTELLIGENCE (ENIAC), 22., 2025, Fortaleza/CE. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 1938-1949. ISSN 2763-9061. DOI: https://doi.org/10.5753/eniac.2025.14267.
