DOI: 10.1145/2820426.2820465
short-paper

Toward a Scoring Schema to Rank Candidate Instances of Ontological Classes: Extracting Brazilian Portuguese Texts from the Web

Published: 27 October 2015

Abstract

With the emergence of ontology-driven Information Extraction systems, boosted by the Semantic Web, there is a need for scoring schemas that enable the automatic classification of information. These schemas, although still little explored for the Portuguese language, provide the measures used when classifying relevant instances into ontological classes. This paper presents: (i) a brief discussion of existing scoring measures based on PMI (Pointwise Mutual Information); (ii) new scoring measures based on PMI and standard deviation; and (iii) an evaluation of all discussed measures on Brazilian Portuguese texts from the web.
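For readers unfamiliar with PMI, the following is a minimal sketch of the textbook definition, PMI(x, y) = log2(p(x, y) / (p(x) p(y))), estimated from occurrence counts. It is not one of the scoring measures proposed or evaluated in the paper, which the abstract does not specify; the function name, parameter names, and counts are hypothetical and chosen only for illustration.

```python
import math


def pmi(count_xy: int, count_x: int, count_y: int, total: int) -> float:
    """Textbook pointwise mutual information in bits.

    count_xy: number of documents containing both terms (hypothetical input)
    count_x, count_y: number of documents containing each term on its own
    total: number of documents in the collection
    """
    if min(count_xy, count_x, count_y, total) <= 0:
        raise ValueError("all counts must be positive")
    p_xy = count_xy / total
    p_x = count_x / total
    p_y = count_y / total
    # Ratio of the observed joint probability to the one expected
    # if the two terms occurred independently.
    return math.log2(p_xy / (p_x * p_y))


# Made-up counts: the terms co-occur 25 times more often than chance.
associated = pmi(count_xy=50, count_x=100, count_y=200, total=10_000)

# Made-up counts: co-occurrence matches independence, so PMI is about 0.
independent = pmi(count_xy=2, count_x=100, count_y=200, total=10_000)
```

A positive value indicates that the two terms appear together more often than independence would predict, which is why PMI-style scores are a natural basis for ranking how strongly a candidate instance is associated with a class label.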


Published In

WebMedia '15: Proceedings of the 21st Brazilian Symposium on Multimedia and the Web
October 2015
266 pages
ISBN:9781450339599
DOI:10.1145/2820426
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

  • CYTED: Ciência Y Tecnologia Para El Desarrollo
  • SBC: Brazilian Computer Society
  • FAPEAM: Fundacao de Amparo a Pesquisa do Estado do Amazonas
  • CNPq: Conselho Nacional de Desenvolvimento Cientifico e Tecn
  • CGIBR: Comite Gestor da Internet no Brazil
  • CAPES: Brazilian Higher Education Funding Council

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. classification measures
  2. experiments
  3. pointwise mutual information

Qualifiers

  • Short-paper

Conference

WebMedia '15

Acceptance Rates

WebMedia '15 Paper Acceptance Rate 21 of 61 submissions, 34%;
Overall Acceptance Rate 270 of 873 submissions, 31%

