Fusing Scene Context to Improve Object Recognition


  • Leandro P. da Silva Pontifícia Universidade Católica do Rio Grande do Sul
  • Roger Granada Pontifícia Universidade Católica do Rio Grande do Sul
  • Juarez Monteiro Pontifícia Universidade Católica do Rio Grande do Sul
  • Duncan D. Ruiz Pontifícia Universidade Católica do Rio Grande do Sul




Keywords: convolutional neural networks, neural networks, object recognition


Computer vision is a branch of science that seeks to give computers the capability of seeing the world around them. Among its tasks, object recognition aims to classify objects and to identify where each object is in a given image. As objects tend to occur in particular environments, their contextual association can be useful for improving object recognition. To address contextual awareness in object recognition, our approach uses the context of the scene to achieve higher-quality recognition by fusing context information with object detection features. Hence, we propose a novel architecture composed of two convolutional neural networks based on two well-known pre-trained nets: Places365-CNN and Faster R-CNN. Our two-stream architecture concatenates object features with scene context features in a late fusion approach. We performed experiments on public datasets (PASCAL VOC 2007, MS COCO and a subset of SUN09), analyzing the performance of our architecture with different threshold scores. Results show that our approach raises the scores of in-context objects and reduces the scores of out-of-context objects.
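
The late fusion step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the feature vectors are random stand-ins for the outputs of the object and scene streams, and the dimensions, the single linear layer, its weights, the threshold value and all function names are hypothetical choices made only for illustration.

```python
import math
import random


def late_fusion(object_features, scene_features):
    """Concatenate per-object features with scene-level context features."""
    return list(object_features) + list(scene_features)


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(fused, weights, biases):
    """Apply one linear layer (hypothetical weights) followed by softmax."""
    logits = [
        sum(w * x for w, x in zip(row, fused)) + b
        for row, b in zip(weights, biases)
    ]
    return softmax(logits)


def keep_confident(scores, threshold):
    """Return (class_index, score) pairs whose score meets the threshold."""
    return [(i, s) for i, s in enumerate(scores) if s >= threshold]


# Toy dimensions, chosen only for illustration (assumptions, not from the paper).
rng = random.Random(0)
object_features = [rng.random() for _ in range(4)]  # stand-in for object-stream features
scene_features = [rng.random() for _ in range(3)]   # stand-in for scene-stream features
num_classes = 5

fused = late_fusion(object_features, scene_features)
weights = [[rng.uniform(-1, 1) for _ in fused] for _ in range(num_classes)]
biases = [0.0] * num_classes

scores = classify(fused, weights, biases)
detections = keep_confident(scores, threshold=0.2)
```

Because the fusion is a plain concatenation, the classifier sees both streams in a single vector, so scene evidence can raise or lower an object's class score. Changing `threshold` changes which scored classes are kept, which mirrors the idea of evaluating at different threshold scores.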




How to Cite

da Silva, L. P., Granada, R., Monteiro, J., & Ruiz, D. D. (2018). Fusing Scene Context to Improve Object Recognition. Journal of Information and Data Management, 9(2), 147. https://doi.org/10.5753/jidm.2018.2050

