Shadow-driven Document Representation

  • Matheus Silva Mota (UNICAMP)
  • Claudia Bauzer Medeiros (UNICAMP)


Document production tools are ubiquitous, resulting in an exponential growth of increasingly complex, distributed and heterogeneous documents. This hampers document exchange, as well as annotation, indexing and retrieval. Existing approaches to these tasks either concentrate on specific formats or require representing documents' content using interoperable standards or schemas. This work presents our effort to address this problem. Rather than trying to modify or convert the document itself, our strategy defines an intermediate and interoperable descriptor, called a shadow, that summarizes key aspects and elements of a given document, improving its annotation, indexing and retrieval regardless of its format. Shadows can be used for different purposes, from semantic and context-sensitive annotations to content indexing and clustering.
MOTA, Matheus Silva; MEDEIROS, Claudia Bauzer. Shadow-driven Document Representation. In: WORKSHOP DE TESES E DISSERTAÇÕES - SIMPÓSIO BRASILEIRO DE SISTEMAS MULTIMÍDIA E WEB (WEBMEDIA), 2011, Florianópolis. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2011. p. 118-121. ISSN 2596-1683.