Comparing TensorFlow and PyTorch for Image Recognition in NAO Robot Soccer

Vitor Amadeu Souza; Hebert Azevedo Sá

doi:10.5753/semish.2025.7943

Vitor Amadeu Souza IME
Hebert Azevedo Sá IME

Abstract

This study compares TensorFlow and PyTorch on image recognition tasks in NAO robot soccer. Image classes were defined, and the models were trained with data augmentation to improve robustness and generalization. The analysis considered training time, classification accuracy, and adaptability to different lighting conditions and viewing angles. In these experiments, TensorFlow outperformed PyTorch, achieving higher accuracy and adapting better to challenging scenarios, which makes it the more suitable choice for computer vision in dynamic environments. The novelty of the study lies in evaluating both frameworks on the NAO robot's specific hardware under realistic robotic conditions.
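As a rough illustration of what data augmentation for lighting and viewpoint variation can look like, the sketch below perturbs a grayscale image stored as a nested list of 0-255 values. It is not the authors' pipeline, which the abstract does not describe. The function names, the brightness range of 0.6 to 1.4, and the 50% flip probability are all assumptions chosen for illustration, and a real experiment would more likely rely on the augmentation utilities that ship with TensorFlow or PyTorch.

```python
import random


def adjust_brightness(image, factor):
    """Scale every pixel by `factor`, clamping results to the 0-255 range."""
    return [[max(0, min(255, round(p * factor))) for p in row] for row in image]


def flip_horizontal(image):
    """Mirror the image left to right (a crude stand-in for a viewpoint change)."""
    return [row[::-1] for row in image]


def augment(image, rng):
    """Return one randomly perturbed copy of `image`.

    The brightness range and flip probability are illustrative assumptions,
    not values taken from the study.
    """
    out = adjust_brightness(image, rng.uniform(0.6, 1.4))
    if rng.random() < 0.5:
        out = flip_horizontal(out)
    return out


def augment_dataset(images, copies, seed=0):
    """Produce `copies` augmented variants of every image, reproducibly."""
    rng = random.Random(seed)
    return [augment(img, rng) for img in images for _ in range(copies)]
```

Calling `augment_dataset(images, copies=4)` would turn each source image into four perturbed variants. Seeding the generator keeps the augmented set identical across runs, which matters when two frameworks are meant to be trained on the same data.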
