Newspaper Layout Recover Using Hierarchical Composed CNN Model

Vitor Ferreira De Carvalho; Vinicius Carbonezi De Souza; Gregory Silva; Andreza Santos; Filipe Costa; Luiz Pita

Vitor Ferreira De Carvalho CPQD
Vinicius Carbonezi De Souza CPQD
Gregory Silva CPQD
Andreza Santos CPQD
Filipe Costa CPQD
Luiz Pita CPQD

Abstract

Clipping service providers continuously monitor the various media in which information may be published. In this context, automatically recovering information from printed media is challenging because of the variety of spatial layouts and the overlapping arrangement of page elements. In this paper, we propose an instance segmentation-based approach that mimics the hierarchical representation common to most Brazilian newspapers by connecting two Convolutional Neural Networks (CNNs) sequentially. Given the lack of publicly available newspaper datasets with hierarchical labeling, we created a private dataset with the support of Brazilian clipping companies. Due to privacy policies, however, this dataset cannot be made publicly available. Experiments on this manually annotated real-world dataset show that composing the models hierarchically yields better performance metrics than non-composed models. Our best model achieved an mAP (mean Average Precision) of 0.63, serving as a reference case for segmenting documents with hierarchical information and enabling further automation in the area.

Keywords: Measurement, Image segmentation, Data privacy, Layout, MIMICs, Companies, Media, Convolutional neural networks, Labeling, Monitoring