Fusion of BLAST and Ensemble of Classifiers for Protein Secondary Structure Prediction

  • Gabriel B. Oliveira Unicamp
  • Helio Pedrini Unicamp
  • Zanoni Dias Unicamp

Abstract


Predicting protein secondary structure is highly relevant to the analysis of global protein folding. In this work, we present a method for protein secondary structure prediction that fuses BLAST with an ensemble of local and global classifiers. We used the amino acid sequence and the sequence similarity information available in the datasets, and we also explored other amino acid characteristics. To evaluate our method, we used files from PDB (from the year 2018 only) together with the CB6133 and CB513 datasets. Using the amino acid sequence and amino acid biological properties, we achieved Q8 accuracies of 87.7%, 82.4%, and 85.6% on PDB 2018, CB6133, and CB513 proteins, respectively, and a Q3 accuracy of 92.5% on PDB 2018 proteins. Using the amino acid sequence and sequence similarity information, we achieved Q8 accuracies of 84.7% and 87.5% on CB6133 and CB513 proteins, respectively. Our method produced competitive results when using BLAST alone and when using the ensemble of classifiers alone, and the fusion of both approaches achieved results superior to state-of-the-art approaches.
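The Q8 and Q3 figures reported above are per-residue accuracies over eight and three secondary structure classes. The following is a minimal sketch of how such scores are typically computed, not the authors' evaluation code. The 8-to-3 state reduction in `Q8_TO_Q3` is one common convention and is an assumption here, as are the function names and the toy label strings.

```python
# Assumed reduction from 8 DSSP-style states to 3 states (helix, strand, coil).
# This is one widely used convention; the abstract does not state which one was used.
Q8_TO_Q3 = {
    "H": "H", "G": "H", "I": "H",   # helix-like states
    "E": "E", "B": "E",             # strand-like states
    "T": "C", "S": "C", "L": "C",   # everything else treated as coil
}


def q_accuracy(predicted: str, observed: str) -> float:
    """Fraction of residues whose predicted label equals the observed label."""
    if len(predicted) != len(observed):
        raise ValueError("label strings must have the same length")
    if not observed:
        raise ValueError("label strings must not be empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


def to_q3(labels: str) -> str:
    """Map a string of 8-state labels to 3-state labels (hypothetical helper)."""
    return "".join(Q8_TO_Q3[s] for s in labels)


# Toy example with made-up labels, for illustration only.
observed = "LHHHGGEEBTSL"
predicted = "LHHHHHEEETSL"

q8 = q_accuracy(predicted, observed)                  # compares all 8 states
q3 = q_accuracy(to_q3(predicted), to_q3(observed))    # compares reduced states
print(f"Q8 = {q8:.2f}, Q3 = {q3:.2f}")
```

Because several 8-state labels collapse into the same 3-state label, Q3 is never lower than Q8 for the same prediction, which is consistent with the higher Q3 figure reported for PDB 2018.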
Keywords: Classifier Ensemble, Amino Acid Sequence, Protein Structure Prediction
Published
07/11/2020
OLIVEIRA, Gabriel B.; PEDRINI, Helio; DIAS, Zanoni. Fusion of BLAST and Ensemble of Classifiers for Protein Secondary Structure Prediction. In: CONFERENCE ON GRAPHICS, PATTERNS AND IMAGES (SIBGRAPI), 33., 2020, Online Event. Proceedings [...]. Porto Alegre: Sociedade Brasileira de Computação, 2020. p. 164-171.