H.761 Support of a New concept Element and a New "recognition" Node-Event to Enable Deep Learning-based Analyses for Media-Nodes

  • Antônio Busson PUC-Rio
  • Alan L. V. Guedes PUC-Rio
  • Sergio Colcher PUC-Rio


In the Machine Learning field, methods based on Deep Learning (e.g., CNNs, RNNs) have become the state of the art for several problems in the multimedia domain, especially audio-visual tasks. Deep Learning methods are typically trained in a supervised manner on datasets containing thousands or millions of media examples and several related concepts/classes. During training, they learn a hierarchy of filters that are applied to the input data to classify/recognize the media content. In a computer vision scenario, for example, given the image pixels, the successive layers of the network learn to extract visual features: the shallow layers extract lower-level features (e.g., edges, corners, contours), while the deeper layers combine these to produce higher-level features (e.g., textures, parts of objects). These representative features can be clustered into groups, each one representing a specific concept. H.761 NCL currently lacks support for Deep Learning methods inside its application specification, because such languages still focus on presentation tasks such as capture, streaming, and presentation. They do not allow programmers to describe the semantic understanding of the media used, nor to handle the recognition of such understanding. In this proposal, we aim at extending NCL to provide such support. More precisely, our proposal enables NCL applications to: (1) describe learning based on structured multimedia datasets; and (2) recognize the content semantics of media elements at presentation time. To achieve these goals, we propose an extension that includes: (a) a new "knowledge" element that describes concepts based on multimedia datasets; and (b) an "area" anchor with an associated "recognition" event that describes when a concept occurs in the multimedia content.
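As a rough illustration of the idea that clustered features can stand for concepts, and that a "recognition" event marks when a concept occurs in the content, the following minimal Python sketch may help. It is not the NCL extension itself nor the authors' implementation: the two-dimensional feature vectors, the `CONCEPTS` centroids, the `max_distance` threshold, and the function names are all hypothetical stand-ins for what a trained network and a player would provide.

```python
from math import dist

# Hypothetical concept centroids in a toy 2-D feature space. In practice
# these would come from features learned by a trained network.
CONCEPTS = {
    "cat": (0.0, 1.0),
    "dog": (1.0, 0.0),
}


def recognize(features, concepts=CONCEPTS, max_distance=0.5):
    """Return the nearest concept name, or None if no centroid is close enough."""
    name, centroid = min(concepts.items(), key=lambda kv: dist(features, kv[1]))
    return name if dist(features, centroid) <= max_distance else None


def recognition_events(frames, **kwargs):
    """Yield (frame_index, concept) each time a concept starts occurring."""
    previous = None
    for index, features in enumerate(frames):
        current = recognize(features, **kwargs)
        # Fire only on the transition into a concept, not on every frame.
        if current is not None and current != previous:
            yield index, current
        previous = current


# Assumed per-frame feature vectors for a short media clip.
frames = [(0.5, 0.5), (0.1, 0.9), (0.0, 1.0), (0.9, 0.2), (0.5, 0.5), (0.1, 1.1)]
events = list(recognition_events(frames))
print(events)
```

In this toy setup, an application would react to each yielded pair much as it would react to the start of any other node event; the nearest-centroid rule is only one simple way to map features to concepts and is assumed here for brevity.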
BUSSON, Antônio; GUEDES, Alan L. V.; COLCHER, Sergio. H.761 Support of a New concept Element and a New "recognition" Node-Event to Enable Deep Learning-based Analyses for Media-Nodes. In: WORKSHOP FUTURO DA TV DIGITAL INTERATIVA - SIMPÓSIO BRASILEIRO DE SISTEMAS MULTIMÍDIA E WEB (WEBMEDIA), 1., 2019, Florianópolis. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2019. p. 211-212. ISSN 2596-1683. DOI: https://doi.org/10.5753/webmedia_estendido.2019.8171.