Comparative Performance Analysis of GPU and CPU Processing for Digital Filtering Operations on Images Captured by the NAO Humanoid Robot

Abstract


This work presents a quantitative comparison of GPU and CPU performance for digital image filtering, using images captured by the NAO humanoid robot. Three algorithms were evaluated: Gaussian filtering, Sobel filtering, and FFT-based high-pass filtering, with experiments varying batch size (1 to 50 images) and resolution (320×240 to 1280×960 pixels). The results show that the GPU advantage is operation-dependent: FFT-based filtering achieved speedups of up to 63.26× at high resolution, while Gaussian filtering outperformed the CPU only at larger batch sizes. Operation-specific efficiency thresholds were established, providing practical guidelines for architecture selection in robotic computer vision systems.
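The two quantities the abstract relies on, speedup and efficiency threshold, can be illustrated with a minimal Python sketch. The abstract does not give the paper's exact definitions, so two assumptions are made here: speedup is taken as CPU time divided by GPU time, and the threshold is taken as the smallest measured batch size from which the GPU remains faster at every larger measured size. The function names `speedup` and `efficiency_threshold` are hypothetical, and the timings are placeholder values chosen only to exercise the code; they are not results from the paper.

```python
from typing import Dict, Optional


def speedup(cpu_seconds: float, gpu_seconds: float) -> float:
    """Return CPU time divided by GPU time; values above 1.0 mean the GPU was faster."""
    if cpu_seconds <= 0 or gpu_seconds <= 0:
        raise ValueError("timings must be positive")
    return cpu_seconds / gpu_seconds


def efficiency_threshold(
    cpu_times: Dict[int, float], gpu_times: Dict[int, float]
) -> Optional[int]:
    """Return the smallest batch size from which the GPU stays faster.

    Only batch sizes present in both mappings are considered. Returns None
    when the GPU is not faster at the largest measured batch size.
    (Assumed definition; the paper may define its thresholds differently.)
    """
    sizes = sorted(set(cpu_times) & set(gpu_times))
    threshold = None
    # Walk from the largest batch size down and stop at the first size
    # where the GPU no longer wins.
    for n in reversed(sizes):
        if speedup(cpu_times[n], gpu_times[n]) > 1.0:
            threshold = n
        else:
            break
    return threshold


# Placeholder timings in seconds, keyed by batch size.
# These are illustrative values, NOT measurements from the paper.
cpu = {1: 0.010, 5: 0.050, 10: 0.100, 50: 0.500}
gpu = {1: 0.030, 5: 0.055, 10: 0.060, 50: 0.150}

print(efficiency_threshold(cpu, gpu))  # prints 10 for the placeholder data
```

With these placeholder numbers the GPU loses at batch sizes 1 and 5 because its fixed overhead dominates, and wins from batch size 10 upward, which mirrors the qualitative behavior the abstract reports for Gaussian filtering.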

Published
19/07/2026
SOUZA, Vitor Amadeu. Comparative Performance Analysis of GPU and CPU Processing for Digital Filtering Operations on Images Captured by the NAO Humanoid Robot. In: WORKSHOP EM DESEMPENHO DE SISTEMAS COMPUTACIONAIS E DE COMUNICAÇÃO (WPERFORMANCE), 25., 2026, Gramado/RS. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2026. p. 153-164. ISSN 2595-6167. DOI: https://doi.org/10.5753/wperformance.2026.21081.