Identifying Large Scale Conformational Changes in Proteins Through Distance Maps and Convolutional Networks

  • Conference paper
Advances in Bioinformatics and Computational Biology (BSB 2022)

Abstract

Conformational changes in protein structures are strongly correlated with functional changes. Some conformational modifications are easily noticeable, while others are more subtle. In this work, we model protein conformation classification by representing each structure as an image of its interatomic distance matrix. We then ask whether a convolutional neural network can identify conformational changes from the distance patterns in these maps alone. To that end, we present a model based on convolutional neural networks that is capable of identifying large-scale conformational changes in proteins. As a case study, we used the S protein from SARS-CoV-2, which is known for its role in the infection of human cells: it undergoes a conformational change in order to bind to the human cell receptor. As a first step, we aim to identify large-scale conformations, such as states in which the S protein trimers are closer together (closed) or further apart (open). The proposed classifier achieved a satisfactory performance under cross-validation, reaching an average validation accuracy of 90.58%, with an error of 22.31%. The model was also able to distinguish the two classes (open and closed states of the S protein), achieving a precision of 84.32% and a recall of 89%. On the test set, the model reached an accuracy of 71.79%, with an error rate of 28.2%; precision and recall reached 68.18% and 78.94%, respectively. In future work, we want to evaluate the ability of such a model to identify even more subtle conformational changes, as well as those caused by point mutations that occur in virus variants.
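
To make the distance-map representation concrete, the following is a minimal Python sketch of how a set of 3D atomic coordinates could be turned into a pairwise distance matrix and then into grayscale pixel intensities. It is an illustration under stated assumptions, not the authors' pipeline: the abstract does not say which atoms are used, how distances are scaled, or how the images are rendered. The function names `distance_map` and `to_grayscale`, the linear scaling to the range 0 to 255, and the toy coordinates are all hypothetical.

```python
import math


def distance_map(coords):
    """Return the symmetric matrix of pairwise Euclidean distances.

    coords: sequence of (x, y, z) tuples, e.g. one representative atom
    per residue (which atoms to use is an assumption left to the caller).
    """
    n = len(coords)
    dmap = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(coords[i], coords[j])
            dmap[i][j] = d
            dmap[j][i] = d  # the matrix is symmetric by construction
    return dmap


def to_grayscale(dmap, max_distance=None):
    """Linearly scale distances to integer pixel intensities in [0, 255].

    The linear scaling is an illustrative choice; distances above
    max_distance are clipped to 255.
    """
    if max_distance is None:
        max_distance = max(max(row) for row in dmap)
    if max_distance == 0:
        return [[0 for _ in row] for row in dmap]
    return [
        [min(255, round(255 * d / max_distance)) for d in row]
        for row in dmap
    ]


# Toy coordinates in angstroms (hypothetical, not taken from any PDB entry).
toy_coords = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (0.0, 0.0, 10.0)]
toy_map = distance_map(toy_coords)
toy_image = to_grayscale(toy_map)
```

In a sketch like this, each structure yields one square image whose diagonal is zero. A large-scale rearrangement such as the trimers moving apart would show up as a shift in the off-diagonal intensities, which is the kind of pattern a convolutional network could in principle learn to separate.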

Acknowledgement

The authors thank the funding agencies Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES), Fundação de Amparo à Pesquisa do Estado de Minas Gerais (FAPEMIG), and Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq).

Author information

Corresponding author

Correspondence to Lucas Moraes dos Santos.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

dos Santos, L.M., de Melo Minardi, R.C. (2022). Identifying Large Scale Conformational Changes in Proteins Through Distance Maps and Convolutional Networks. In: Scherer, N.M., de Melo-Minardi, R.C. (eds) Advances in Bioinformatics and Computational Biology. BSB 2022. Lecture Notes in Computer Science, vol 13523. Springer, Cham. https://doi.org/10.1007/978-3-031-21175-1_7

  • DOI: https://doi.org/10.1007/978-3-031-21175-1_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-21174-4

  • Online ISBN: 978-3-031-21175-1

  • eBook Packages: Computer Science, Computer Science (R0)
