Peptide-Protein Interface Classification Using Convolutional Neural Networks

Abstract


Peptides are short chains of amino acid residues linked by peptide bonds. Their potential to act as protein inhibitors has contributed to advances in rational drug design, and understanding protein-peptide interactions is potentially useful for several biotechnological applications. This is not a trivial task, however, because peptides can adopt different conformations when interacting with proteins. In this paper, we develop a classification model for protein-peptide interfaces using a convolutional neural network and distance maps. We aim to find out whether a convolutional neural network can classify peptides from the patterns of distances in these maps alone. To evaluate our proposal, we performed two case studies, classifying protein-peptide interfaces by peptide sequence and by receptor class. We also compared the distance map approach with a graph-based structural signature approach. Graph-based methods were slightly superior in almost all comparisons performed. However, distance map-based methods achieved better results for some classes, such as hormone, membrane, and viral proteins. These results shed light on the potential use of distance maps for classifying protein-peptide interfaces. Nevertheless, more experiments may be needed to explore this use.
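To make the notion of a distance map concrete, the following is a minimal sketch and not the authors' implementation. It assumes each residue is represented by a single 3D coordinate (for example its alpha carbon) and that maps are zero-padded to a fixed size so they can be fed to a convolutional network as single-channel images. The function names `interface_distance_map` and `pad_to_shape`, the toy coordinates, and the 4 x 4 target size are all hypothetical.

```python
import numpy as np


def interface_distance_map(peptide_xyz, receptor_xyz):
    """Return the matrix of Euclidean distances between every peptide
    point (rows) and every receptor point (columns)."""
    peptide_xyz = np.asarray(peptide_xyz, dtype=float)
    receptor_xyz = np.asarray(receptor_xyz, dtype=float)
    # Broadcasting builds an (n_peptide, n_receptor, 3) array of differences.
    diff = peptide_xyz[:, None, :] - receptor_xyz[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def pad_to_shape(dmap, shape, fill=0.0):
    """Crop or pad a distance map to a fixed shape (an assumed way of
    giving every interface the same input size)."""
    out = np.full(shape, fill, dtype=float)
    rows = min(shape[0], dmap.shape[0])
    cols = min(shape[1], dmap.shape[1])
    out[:rows, :cols] = dmap[:rows, :cols]
    return out


# Toy coordinates (hypothetical, in angstroms): 2 peptide and 3 receptor residues.
peptide = np.array([[0.0, 0.0, 0.0], [3.0, 4.0, 0.0]])
receptor = np.array([[0.0, 0.0, 0.0], [0.0, 0.0, 5.0], [3.0, 4.0, 0.0]])

dmap = interface_distance_map(peptide, receptor)
image = pad_to_shape(dmap, (4, 4))
print(dmap.shape, image.shape)
```

Each row of `dmap` corresponds to a peptide residue and each column to a receptor residue, so the padded `image` is the kind of 2D array a convolutional network could consume. How the paper selects atoms, normalizes distances, or sizes its maps is not stated in the abstract.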

Keywords: Convolutional neural networks, distance maps, protein-peptide interactions

Published
13 June 2023
SANTOS, Lucas Moraes dos; MARIANO, Diego; BASTOS, Luana Luiza; CIOLETTI, Alessandra Gomes; DE MELO MINARDI, Raquel Cardoso. Peptide-Protein Interface Classification Using Convolutional Neural Networks. In: SIMPÓSIO BRASILEIRO DE BIOINFORMÁTICA (BSB), 16., 2023, Curitiba/PR. Proceedings [...]. Porto Alegre: Sociedade Brasileira de Computação, 2023. p. 112-122. ISSN 2316-1248.