Genetic Programming for Rule Generation Used in Protein Interaction Extraction from Texts

  • Ana Livia Rodrigues Queiroz USP
  • Evandro Eduardo Seron Ruiz USP
  • Renato Tinós USP

Abstract


This work aims to optimize a combination of syntax patterns used to extract protein-protein interactions from scientific text. For this purpose, we present a system based on genetic programming (GP), an evolutionary algorithm whose individuals are symbolic expressions. GP allows new rules to be generated from a preliminary set of rules defined by an expert. The classification error obtained on a set of labeled examples is used as the evaluation function. The training set used to evaluate the individuals is the BioCreAtIvE-PPI corpus, which contains textual information about interactions between proteins and/or genes.
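To make the evaluation function concrete, the following is a minimal sketch of scoring a candidate rule by its classification error on labeled examples. It is not the authors' implementation: the `contains`, `AND`, and `OR` primitives, the toy sentences, and their labels are all hypothetical stand-ins for the expert-defined syntax patterns and the BioCreAtIvE-PPI corpus.

```python
from typing import Callable, List, Tuple

# A rule maps a sentence to True (interaction predicted) or False.
Rule = Callable[[str], bool]


def contains(word: str) -> Rule:
    """Hypothetical primitive: true if the sentence contains `word` as a token."""
    return lambda sentence: word in sentence.lower().split()


def AND(a: Rule, b: Rule) -> Rule:
    """Combine two rules; both must fire."""
    return lambda sentence: a(sentence) and b(sentence)


def OR(a: Rule, b: Rule) -> Rule:
    """Combine two rules; at least one must fire."""
    return lambda sentence: a(sentence) or b(sentence)


def classification_error(rule: Rule, examples: List[Tuple[str, bool]]) -> float:
    """Fraction of labeled examples the rule classifies incorrectly."""
    wrong = sum(1 for sentence, label in examples if rule(sentence) != label)
    return wrong / len(examples)


# Toy labeled data (invented for illustration, not taken from any corpus).
examples = [
    ("ProtA binds ProtB in vitro", True),
    ("ProtA interacts with ProtB", True),
    ("ProtA and ProtB were purified", False),
    ("ProtC is expressed in liver", False),
]

# A seed rule, as an expert might define it, and a combined rule of the
# kind a GP search could produce by recombining primitives.
seed_rule = contains("binds")
combined_rule = OR(contains("binds"), contains("interacts"))

seed_error = classification_error(seed_rule, examples)
combined_error = classification_error(combined_rule, examples)
print(seed_error, combined_error)
```

In a GP setting, a lower error means a fitter individual, so under this toy scoring the combined rule would be preferred over the seed rule during selection.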

Published
2012-07-16
QUEIROZ, Ana Livia Rodrigues; RUIZ, Evandro Eduardo Seron; TINÓS, Renato. Genetic Programming for Rule Generation Used in Protein Interaction Extraction from Texts. In: BRAZILIAN SYMPOSIUM ON COMPUTING APPLIED TO HEALTH (SBCAS), 12., 2012, Curitiba/PR. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2012. p. 232-235. ISSN 2763-8952.
