Temporal Video Scene Segmentation By Fused Bags-of-Features

Rodrigo Mitsuo Kishi; Tiago Henrique Trojahn; Rudinei Goularte

Rodrigo Mitsuo Kishi, USP-UFMS
Tiago Henrique Trojahn, USP-IFSP
Rudinei Goularte, USP

Abstract

Temporal segmentation of video into semantically coherent scenes is a fundamental step toward improving video operations such as browsing, retrieval, and recommendation. In terms of efficacy, the automatic scene segmentation methods available in the literature still fall short of the requirements of practical applications. To narrow this gap, this paper presents a new scene segmentation method based on multimodal early fusion, which extends the discriminative capability of the latent semantics of classical single-modal bags-of-features to a multimodal paradigm. The approach was designed to refine the latent semantics of single-modal data by identifying and representing audiovisual patterns while preserving the single-modal patterns of visual and aural words. In experiments on a publicly available dataset, the proposed method achieved higher average values of the FCO metric than previous state-of-the-art approaches.

Keywords: Multimedia Systems, Video Scene Segmentation, Feature Fusion
