Autoencoder-Based Super-Resolution Approach for Aerial Robot Navigation

  • Jhonathan A. Oliveira UFAM
  • Felipe G. Oliveira UFAM

Abstract


Images are widely used in the navigation of autonomous aerial vehicles, but processing them onboard can demand high computational capacity and energy consumption. As an alternative, this paper proposes the AESR (Autoencoder-based Super-Resolution) method to enable remote processing by reconstructing high-resolution images from low-resolution versions captured by the vehicle. The approach uses autoencoders with skip connections and self-attention modules. Results show that AESR outperforms state-of-the-art techniques on metrics such as PSNR, SSIM, LPIPS, DISTS, NIQE, and BRISQUE, even in challenging scenarios.
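
To make the evaluation setting concrete, here is a minimal sketch of how a reconstruction can be scored with PSNR, one of the metrics listed above. It is not the AESR method: the learned autoencoder is replaced by plain nearest-neighbour upsampling as a stand-in, the low-resolution capture is simulated by average pooling, and the 4x4 frame and all function names are hypothetical.

```python
import math


def downsample(img, factor):
    """Average-pool a 2D grayscale image (list of rows) by an integer factor.

    Assumption: this simulates the low-resolution frame sent by the vehicle.
    """
    h, w = len(img), len(img[0])
    return [
        [
            sum(img[y + dy][x + dx] for dy in range(factor) for dx in range(factor))
            / (factor * factor)
            for x in range(0, w, factor)
        ]
        for y in range(0, h, factor)
    ]


def upsample_nearest(img, factor):
    """Nearest-neighbour upsampling, a stand-in for the learned reconstruction."""
    return [
        [img[y // factor][x // factor] for x in range(len(img[0]) * factor)]
        for y in range(len(img) * factor)
    ]


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    n = len(reference) * len(reference[0])
    mse = sum(
        (r - e) ** 2
        for ref_row, est_row in zip(reference, estimate)
        for r, e in zip(ref_row, est_row)
    ) / n
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical 4x4 grayscale frame used only for illustration.
high_res = [
    [10, 20, 30, 40],
    [20, 30, 40, 50],
    [30, 40, 50, 60],
    [40, 50, 60, 70],
]
low_res = downsample(high_res, 2)
restored = upsample_nearest(low_res, 2)
score = psnr(high_res, restored)
```

A higher PSNR means the restored image is closer to the original. A learned super-resolution model would take the place of `upsample_nearest` in this comparison.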

Published
2025-07-01
OLIVEIRA, Jhonathan A.; OLIVEIRA, Felipe G. Autoencoder-Based Super-Resolution Approach for Aerial Robot Navigation. In: ICET TECHNOLOGY CONFERENCE (CONNECTECH), 2., 2025, Itacoatiara/AM. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 270-277. DOI: https://doi.org/10.5753/connect.2025.12339.