On Using Wikipedia to Build Knowledge Bases for Information Extraction by Text Segmentation

Authors

  • Elton Serra, Universidade Federal do Amazonas
  • Eli Cortez, Universidade Federal do Amazonas, http://www.dcc.ufam.edu.br/~eccv
  • Altigran S. da Silva, Universidade Federal do Amazonas
  • Edleno S. de Moura, Universidade Federal do Amazonas

DOI:

https://doi.org/10.5753/jidm.2011.1408

Keywords:

Data Management, Information Extraction, Knowledge Bases

Abstract

We propose a strategy for automatically obtaining datasets from Wikipedia to support unsupervised Information Extraction by Text Segmentation (IETS) methods. Despite the importance of preexisting datasets to unsupervised IETS methods, there has been no proper discussion in the literature of how such datasets can be effectively obtained or built. We report experiments in which three state-of-the-art unsupervised IETS methods use datasets obtained with our proposed strategy under several configurations, covering IETS tasks on three different domains. The results suggest that our strategy is valid and effective, and that IETS methods can achieve very good performance when the generated datasets contain a reasonable number of values representative of the domain of the data to be extracted.
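
The abstract does not spell out the strategy itself. Purely as an illustration of the general idea of harvesting domain values from Wikipedia, the Python sketch below collects attribute-value pairs from infobox wikitext into per-attribute value sets, the kind of knowledge base an unsupervised IETS method could match text segments against. The function, the regex simplifications, and the toy page are hypothetical and not taken from the paper.

    import re
    from collections import defaultdict

    def extract_infobox_values(wikitext):
        """Collect attribute -> value sets from infobox-style wikitext.

        Hypothetical simplification: assumes flat '| key = value' lines;
        real infoboxes nest templates, references, and richer markup.
        """
        values = defaultdict(set)
        for match in re.finditer(r"^\s*\|\s*(\w+)\s*=\s*(.+)$", wikitext, re.MULTILINE):
            attr, raw = match.group(1), match.group(2)
            # Resolve [[target|label]] and [[target]] links to their display text.
            clean = re.sub(r"\[\[(?:[^|\]]*\|)?([^\]]+)\]\]", r"\1", raw).strip()
            if clean:
                values[attr].add(clean)
        return values

    # Toy usage on a fragment of infobox wikitext:
    page = """{{Infobox musical artist
    | name = The Beatles
    | origin = [[Liverpool]], England
    | genre = [[Rock music|Rock]]
    }}"""
    kb = extract_infobox_values(page)
    print(sorted(kb["genre"]))  # ['Rock']

Run over all pages carrying a target domain's infobox template, the accumulated value sets approximate the per-attribute vocabularies that text segmentation methods rely on.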

Published

2011-10-04

How to Cite

Serra, E., Cortez, E., da Silva, A. S., & de Moura, E. S. (2011). On Using Wikipedia to Build Knowledge Bases for Information Extraction by Text Segmentation. Journal of Information and Data Management, 2(3), 259. https://doi.org/10.5753/jidm.2011.1408

Issue

Vol. 2 No. 3 (2011)

Section

SBBD Articles