Digital Video Stabilization: Methods, Datasets, and Evaluation

  • Marcos Roberto e Souza (UNICAMP)
  • Helena de Almeida Maia (UNICAMP)
  • Hélio Pedrini (UNICAMP)

Abstract

This thesis addresses digital video stabilization, the removal of unwanted camera shake from videos in software. We conducted a thorough literature review, which resulted in two survey papers. We also proposed a new stability measure aligned with human perception and a novel method for evaluating 2D camera motion estimation, so that the quality of stabilized videos can be assessed more reliably. We then introduced NAFT, a semi-online direct warping stabilization (DWS) method with a new neighborhood-aware mechanism, which stabilizes videos without relying on an explicit definition of stability. To train NAFT effectively, we created SynthStab, a paired synthetic dataset. NAFT achieves stabilization quality comparable to non-DWS methods with a significantly smaller model (a 14× reduction in size).
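
For readers new to the topic, the following minimal sketch illustrates what removing unwanted shake in software can mean in the simplest classical setting: accumulate the inter-frame camera motion into a trajectory, smooth that trajectory, and shift each frame by the difference. This is not NAFT, which is learned and avoids an explicit definition of stability. The moving-average filter, the function name `smooth_trajectory`, the `radius` parameter, and the synthetic motion values are all assumptions made for illustration.

```python
import numpy as np


def smooth_trajectory(trajectory: np.ndarray, radius: int = 15) -> np.ndarray:
    """Moving-average filter over a camera trajectory with one row per frame."""
    window = 2 * radius + 1
    kernel = np.ones(window) / window
    # Repeat the first and last rows so the ends are not pulled toward zero.
    padded = np.pad(trajectory, ((radius, radius), (0, 0)), mode="edge")
    columns = [
        np.convolve(padded[:, k], kernel, mode="valid")
        for k in range(trajectory.shape[1])
    ]
    return np.stack(columns, axis=1)


# Synthetic inter-frame motion (dx, dy, rotation): a slow pan plus random jitter.
# These values are invented for the example, not taken from any dataset.
rng = np.random.default_rng(0)
motions = np.array([0.5, 0.0, 0.0]) + rng.normal(scale=[2.0, 2.0, 0.01], size=(200, 3))

# Accumulate per-frame motion into the camera path, then smooth the path.
trajectory = np.cumsum(motions, axis=0)
smoothed = smooth_trajectory(trajectory)

# Per-frame correction that moves each frame from the shaky path to the smooth one.
corrections = smoothed - trajectory

raw_jitter = np.diff(trajectory[:, 0]).std()
stable_jitter = np.diff(smoothed[:, 0]).std()
print(f"horizontal jitter: {raw_jitter:.2f} px before, {stable_jitter:.2f} px after")
```

In a full pipeline, each row of `corrections` would parameterize a warp applied to the corresponding frame, and the inter-frame motion would come from a motion estimator rather than a random generator. Both steps are omitted here.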

Published
30 September 2024
SOUZA, Marcos Roberto e; MAIA, Helena de Almeida; PEDRINI, Hélio. Digital Video Stabilization: Methods, Datasets, and Evaluation. In: WORKSHOP DE TESES E DISSERTAÇÕES - CONFERENCE ON GRAPHICS, PATTERNS AND IMAGES (SIBGRAPI), 37., 2024, Manaus/AM. Proceedings [...]. Porto Alegre: Sociedade Brasileira de Computação, 2024. p. 42-48. DOI: https://doi.org/10.5753/sibgrapi.est.2024.31643.
