Automatic Identification of Relevant Moments in Security Force Videos Using Multimodal Analysis

Abstract


As police officers are increasingly required to wear body cameras, there is a growing need for algorithms that can automatically detect relevant moments in the footage. This paper presents an automated system that uses audio and video inputs to highlight key events in recordings, reducing the need for operators to watch entire videos. Our method detects firearms and crowd gatherings with object detection and identifies people raising their hands with pose estimation. We also detect sound patterns such as sirens, gunshots, and shouts, and use Automatic Speech Recognition to transcribe conversations and identify keywords associated with relevant events. Evaluated on videos from YouTube channels such as PMTVSP, PoliceActivity, and Code Blue Cam, our system effectively identifies significant moments in security footage in which agents are engaged in activities beyond routine patrol.
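
The abstract describes combining cues from several modalities (object detection, pose estimation, sound events, and keywords from transcripts) to flag relevant moments. As a rough illustration only, the sketch below shows one way such timestamped cues could be fused into time intervals. It is not the authors' implementation: the `Detection` type, the helper names `spot_keywords` and `merge_into_moments`, and the 5-second padding are all assumptions made for this example, and the detectors themselves are assumed to have already produced timestamps.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """Hypothetical container for one cue produced by any modality."""
    time_s: float   # timestamp of the cue, in seconds
    modality: str   # e.g. "video", "audio", "speech"
    label: str      # e.g. "firearm", "siren", "keyword:gun"


def spot_keywords(segments, keywords):
    """Turn (start_s, text) transcript segments into speech detections."""
    wanted = {k.lower() for k in keywords}
    hits = []
    for start_s, text in segments:
        for word in text.lower().split():
            token = word.strip(".,!?;:\"'")  # drop surrounding punctuation
            if token in wanted:
                hits.append(Detection(start_s, "speech", f"keyword:{token}"))
    return hits


def merge_into_moments(detections, padding_s=5.0):
    """Pad each cue by padding_s (assumed value) and merge overlapping windows."""
    moments = []  # list of (start_s, end_s, labels)
    for det in sorted(detections, key=lambda d: d.time_s):
        start = max(0.0, det.time_s - padding_s)
        end = det.time_s + padding_s
        if moments and start <= moments[-1][1]:
            # Window overlaps the previous moment: extend it and add the label.
            prev_start, prev_end, labels = moments[-1]
            moments[-1] = (prev_start, max(prev_end, end), labels | {det.label})
        else:
            moments.append((start, end, {det.label}))
    return moments


# Made-up example cues; real ones would come from the detectors.
cues = [
    Detection(12.0, "video", "firearm"),
    Detection(15.0, "audio", "gunshot"),
    Detection(60.0, "video", "hands_up"),
]
cues += spot_keywords([(14.0, "Drop the gun!")], ["gun"])
moments = merge_into_moments(cues)
```

With these made-up cues, the three events around 12 to 15 seconds collapse into a single interval and the cue at 60 seconds forms a second one, so an operator would review two short clips instead of the full recording.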
Keywords: Egocentric Vision, Video Understanding, Semantic Information, Security Forces
Published
06/11/2024
FERREIRA, Luísa; SILVA, Michel. Automatic Identification of Relevant Moments in Security Force Videos Using Multimodal Analysis. In: WORKSHOP DE VISÃO COMPUTACIONAL (WVC), 19., 2024, Rio Paranaíba/MG. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2024. p. 67-74.
