Video scene segmentation through an early fusion multimodal approach

  • Rodrigo Mitsuo Kishi USP
  • Rudinei Goularte USP


Temporal segmentation of video into scenes is a prerequisite for various tasks in Multimedia Information Retrieval, such as video summarization, content-based video retrieval, and video recommendation. There is, however, no satisfactory method to automatically segment video into scenes. State-of-the-art scene segmentation methods are multimodal, in order to match the multimodal nature of video. Although these methods are multimodal, no true early fusion method was found in the literature. Early fusion has been shown to be useful in related multimedia tasks, where potential correlations between data streams from different sources are discovered before the main processing step, improving results. Motivated by this situation, this PhD project proposes to investigate the impact of a true early fusion multimodal approach on the temporal video scene segmentation task.
KISHI, Rodrigo Mitsuo; GOULARTE, Rudinei. Video scene segmentation through an early fusion multimodal approach. In: WORKSHOP DE TESES E DISSERTAÇÕES - SIMPÓSIO BRASILEIRO DE SISTEMAS MULTIMÍDIA E WEB (WEBMEDIA), 2016, Teresina. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2016. p. 41-46. ISSN 2596-1683.