Application of Deep Learning Models in Estimating Spatial Relations of Objects to Assist Visually Impaired People

  • Aline Elí Gassenn IFAM
  • Marcelo Chamy Machado IFAM
  • Eulanda Miranda dos Santos UFAM

Abstract


In this paper, we explore computer vision and machine learning to develop an assistive algorithm for visually impaired people. Despite progress in assistive technologies, the literature reveals significant gaps in integrating real-time object detection and depth estimation. The proposed methodology uses two pre-trained models, one for object detection (YOLO) and the other for depth estimation (MiDaS). The presented algorithm interprets monocular images, reports the spatial relationship between detected objects, and delivers the resulting text as audio. The performance analysis compares combinations of different architectures on CPU and GPU and indicates the approach's potential to improve the quality of life of visually impaired people.
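
As a rough illustration of the step that relates detected objects to each other, the sketch below combines bounding boxes with a depth map to build one descriptive sentence. It is not the authors' implementation: the function names, the `tolerance` parameter, the sentence template, and the toy data are all hypothetical, and the models are replaced by hand-written inputs. It assumes boxes are given as `(x1, y1, x2, y2)` pixel coordinates and that larger depth-map values mean closer surfaces, as in inverse relative depth.

```python
from statistics import median


def box_depth(depth_map, box):
    """Median depth-map value inside a box (x1, y1, x2, y2), end-exclusive."""
    x1, y1, x2, y2 = box
    values = [v for row in depth_map[y1:y2] for v in row[x1:x2]]
    return median(values)


def describe_relation(name_a, box_a, name_b, box_b, depth_map, tolerance=0.1):
    """Build one sentence relating object A to object B (hypothetical template)."""
    # Horizontal relation from the horizontal centres of the two boxes.
    cx_a = (box_a[0] + box_a[2]) / 2
    cx_b = (box_b[0] + box_b[2]) / 2
    side = "to the left of" if cx_a < cx_b else "to the right of"

    # Assumption: larger values mean closer (inverse relative depth).
    d_a = box_depth(depth_map, box_a)
    d_b = box_depth(depth_map, box_b)
    if abs(d_a - d_b) <= tolerance * max(d_a, d_b):
        distance = "at about the same distance as"
    elif d_a > d_b:
        distance = "closer than"
    else:
        distance = "farther away than"
    return f"The {name_a} is {side} and {distance} the {name_b}."


# Toy 4x6 depth map: left half near (0.9), right half far (0.3).
depth_map = [[0.9, 0.9, 0.9, 0.3, 0.3, 0.3] for _ in range(4)]
chair = (0, 0, 3, 4)  # stand-in for a detector's bounding box
table = (3, 0, 6, 4)

print(describe_relation("chair", chair, "table", table, depth_map))
# prints: The chair is to the left of and closer than the table.
```

The median is used here instead of the mean so that background pixels inside a box influence the estimate less. The resulting string could then be handed to a text-to-speech engine to produce the audio output the abstract describes.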

Published
2024-06-25
GASSENN, Aline Elí; MACHADO, Marcelo Chamy; SANTOS, Eulanda Miranda dos. Application of Deep Learning Models in Estimating Spatial Relations of Objects to Assist Visually Impaired People. In: BRAZILIAN SYMPOSIUM ON COMPUTING APPLIED TO HEALTH (SBCAS), 24., 2024, Goiânia/GO. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2024. p. 272-283. ISSN 2763-8952. DOI: https://doi.org/10.5753/sbcas.2024.2191.
