Peptide-Protein Interface Classification Using Convolutional Neural Networks
Abstract
Peptides are short chains of amino acid residues linked by peptide bonds, and their potential to act as protein inhibitors has contributed to advances in rational drug design. Understanding the interactions between proteins and peptides is therefore useful for several biotechnological applications. However, this is not a trivial task, since peptides can adopt different conformations when interacting with proteins. In this paper, we develop a classification model for protein-peptide interfaces using a convolutional neural network and distance maps. Our aim is to determine whether a convolutional neural network can classify peptides from the patterns of distances in these maps alone. To evaluate our proposal, we performed two case studies, classifying protein-peptide interfaces by peptide sequence and by receptor class. Additionally, we compared the distance map approach with a graph-based structural signature approach. Graph-based methods were slightly superior in almost all comparisons performed. However, the distance map-based method achieved better results for some classes, such as hormones, membranes, and viral proteins. These results shed light on the potential use of distance maps for classifying protein-peptide interfaces, although more experiments may be needed to explore this use.
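As a rough illustration of the distance maps mentioned above, the sketch below builds a peptide-by-receptor matrix of Euclidean distances. It is not the authors' implementation. It assumes one representative atom per residue (for example, the alpha carbon), and the function name `distance_map` and the toy coordinates are hypothetical.

```python
import math


def distance_map(peptide_coords, receptor_coords):
    """Return a matrix of Euclidean distances.

    Rows correspond to peptide residues and columns to receptor residues.
    Each coordinate is an (x, y, z) tuple; the choice of one point per
    residue is an assumption made for this sketch.
    """
    return [
        [math.dist(p, r) for r in receptor_coords]
        for p in peptide_coords
    ]


# Toy coordinates in angstroms, invented for illustration only.
peptide = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0)]
receptor = [(0.0, 4.0, 0.0), (3.8, 0.0, 3.0), (0.0, 0.0, 5.0)]

dmap = distance_map(peptide, receptor)
for row in dmap:
    print([round(d, 2) for d in row])
```

A matrix like this can be treated as a single-channel image, which is what makes it a natural input for a convolutional neural network. In practice the maps would likely need padding or resizing to a fixed shape before training.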