Predicting Mutation-Driven Changes in the SARS-CoV-2 Spike Protein Using Structural Signatures and Neural Networks

  • Eduardo U. M. Moreira (UFMG)
  • Leandro Morais (UFMG)
  • Sheila C. Araujo (UFMG / UFSJ)
  • Rafael P. Lemos (UFMG)
  • Ana Luísa A. Bastos (UFMG)
  • Alessandra Lima (UFMG)
  • Diego Mariano (UFMG)
  • Raquel C. de Melo-Minardi (UFMG)

Abstract


COVID-19, caused by the SARS-CoV-2 virus, has been a global pandemic since 2020 and has resulted in nearly 7 million deaths. The virus's rapid spread has been driven by more transmissible variants, many of which carry mutations in the spike glycoprotein, which is key to cell invasion and is a vaccine target. Understanding these mutations is crucial for anticipating more dangerous variants. This study developed a computational method to predict the impact of mutations on the spike protein. The method combines data from 23,472 mutations, molecular modeling, graph-based structural signatures, and a machine-learning approach based on neural networks. The model was used to analyze 318 proteins, and the results indicate that the methodology is effective for assessing the potential of new variants.
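To make the idea of a graph-based structural signature more concrete, the sketch below counts how many pairs of points lie within each of several distance cutoffs, then compares a reference structure with a mutated one. The abstract does not say which signature the authors used or how it was computed, so this is only an illustration of the general idea; the function names (`cutoff_signature`, `signature_variation`), the cutoffs, and the toy coordinates are all hypothetical.

```python
from itertools import combinations
from math import dist


def cutoff_signature(coords, cutoffs):
    """Count point pairs whose distance is within each cutoff (cumulative)."""
    distances = [dist(a, b) for a, b in combinations(coords, 2)]
    return [sum(d <= c for d in distances) for c in cutoffs]


def signature_variation(wild_type, mutant, cutoffs):
    """Element-wise difference between mutant and wild-type signatures."""
    wt = cutoff_signature(wild_type, cutoffs)
    mt = cutoff_signature(mutant, cutoffs)
    return [m - w for w, m in zip(wt, mt)]


# Toy coordinates in angstroms, invented for illustration only.
wild_type = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
mutant = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
cutoffs = [1.5, 2.5, 3.5]

print(cutoff_signature(wild_type, cutoffs))
print(signature_variation(wild_type, mutant, cutoffs))
```

A vector of this kind, computed per structure, is the sort of fixed-length numeric input that a neural network could take; whether the authors feed their network the raw signature, a difference vector, or something else is not stated here.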

Published
2 December 2024
MOREIRA, Eduardo U. M.; MORAIS, Leandro; ARAUJO, Sheila C.; LEMOS, Rafael P.; BASTOS, Ana Luísa A.; LIMA, Alessandra; MARIANO, Diego; MELO-MINARDI, Raquel C. de. Predicting Mutation-Driven Changes in the SARS-CoV-2 Spike Protein Using Structural Signatures and Neural Networks. In: SIMPÓSIO BRASILEIRO DE BIOINFORMÁTICA (BSB), 17., 2024, Vitória/ES. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2024. p. 167-178. ISSN 2316-1248. DOI: https://doi.org/10.5753/bsb.2024.245606.