Non-hierarchical clustering technique applied to the in silico characterization of promoters associated with Escherichia coli heat shock genes

  • Gabriel Dall’Alba UCS
  • Scheila de Avila e Silva UCS

Abstract


Computational techniques play an important role in the post-genomic era because of the volume of biological data generated and released. In this context, this paper carries out an in silico analysis of σ24- and σ32-dependent promoter sequences. The analysis used a non-hierarchical clustering technique with stability values as input data. The content of the clusters was evaluated by the average purity obtained for the clusters. Overall, the clusters had an average purity of 63%; clusters 7, 10 and 11, however, reached 85%, 82% and 83% purity, respectively. It was possible to identify degrees of degeneration in the sequences and grouping features. This paper thus contributes to the understanding of the distinct profiles of bacterial promoters. Furthermore, these results can be applied to reduce false positives in in silico promoter prediction tools.
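
The purity measure mentioned above can be illustrated with a minimal sketch. It assumes that a cluster's purity is the share of its sequences belonging to the cluster's most common sigma-factor class, and that the average is an unweighted mean over clusters; the abstract does not state either definition. The function names and the toy assignments below are hypothetical and are not data from the paper.

```python
from collections import Counter, defaultdict


def cluster_purities(cluster_ids, class_labels):
    """Return {cluster_id: purity}.

    Purity is taken here (an assumption) as the fraction of a cluster's
    members that carry the cluster's most common class label.
    """
    members = defaultdict(list)
    for cid, label in zip(cluster_ids, class_labels):
        members[cid].append(label)
    return {
        cid: Counter(labels).most_common(1)[0][1] / len(labels)
        for cid, labels in members.items()
    }


def average_purity(purities):
    """Unweighted mean of the per-cluster purities (an assumption)."""
    return sum(purities.values()) / len(purities)


# Hypothetical toy assignment: ten promoter sequences in three clusters.
cluster_ids = [0, 0, 0, 0, 1, 1, 1, 2, 2, 2]
class_labels = [
    "sigma32", "sigma32", "sigma32", "sigma24",
    "sigma24", "sigma24", "sigma32",
    "sigma24", "sigma24", "sigma24",
]

purities = cluster_purities(cluster_ids, class_labels)
mean_purity = average_purity(purities)
```

With this definition, a cluster containing only σ24-dependent or only σ32-dependent promoters has a purity of 1.0, while a cluster split evenly between the two classes has a purity of 0.5.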

Published
2016-07-04
DALL’ALBA, Gabriel; DE AVILA E SILVA, Scheila. Técnica de Clusterização não-hierárquica aplicada para a caracterização in silico de promotores associados a genes de choque térmico de Escherichia coli. In: BRAZILIAN E-SCIENCE WORKSHOP (BRESCI), 10., 2016, Porto Alegre. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2016. p. 229-236. ISSN 2763-8774. DOI: https://doi.org/10.5753/bresci.2016.9970.