Graph Attention Neural Networks Improving Molecular Docking Rank with Protein-Ligand Contact Maps

Glauco E. Lima; Simone Q. Pantaleão; Isabelle A. Pereira; Ana L. Scott

doi:10.5753/bsb.2024.245595

Glauco E. Lima UFABC
Simone Q. Pantaleão UFABC
Isabelle A. Pereira UFABC
Ana L. Scott UFABC

Abstract

Predicting the binding mode and affinity of small molecules to proteins is key to understanding their interactions. Docking programs commonly rely on empirical scoring functions, but predicting binding mode and affinity accurately remains challenging. Docking programs can generate ligand conformations similar to crystallographic structures, yet scoring functions often struggle to identify the correct pose. This study employs Graph Attention Networks (GAT) to learn protein-ligand contact information and re-rank docking poses. Using the PDBbind core set, binding poses are generated with AutoDock Vina and evaluated by their contacts and RMSD. Close contacts are mapped with BINANA, and bipartite graphs are built with atomic descriptors computed using RDKit.
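The pipeline described in the abstract can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the distance cutoff, the RMSD convention (no superposition, identical atom ordering), and the function names `contact_edges`, `rmsd`, and `rerank` are all assumptions made for illustration. In the study itself, contacts come from BINANA, descriptors from RDKit, and the scores used for re-ranking from a trained GAT.

```python
import math

# Assumed distance threshold in angstroms for a "close contact";
# the abstract does not state the value used in the study.
CONTACT_CUTOFF = 4.0


def contact_edges(ligand_xyz, protein_xyz, cutoff=CONTACT_CUTOFF):
    """Return bipartite edges (ligand_index, protein_index).

    An edge links a ligand atom to a protein atom when their
    Euclidean distance is at most `cutoff`.
    """
    edges = []
    for i, lig_atom in enumerate(ligand_xyz):
        for j, prot_atom in enumerate(protein_xyz):
            if math.dist(lig_atom, prot_atom) <= cutoff:
                edges.append((i, j))
    return edges


def rmsd(pose_xyz, reference_xyz):
    """RMSD between a docked pose and a reference ligand.

    Assumes both coordinate lists hold the same atoms in the same
    order and that no superposition is applied.
    """
    if len(pose_xyz) != len(reference_xyz):
        raise ValueError("pose and reference must have the same atom count")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(pose_xyz, reference_xyz))
    return math.sqrt(squared / len(pose_xyz))


def rerank(scores):
    """Return pose indices sorted from highest to lowest score.

    `scores` stands in for the per-pose output of a trained model
    (hypothetical here); higher means more likely to be correct.
    """
    return sorted(range(len(scores)), key=lambda k: scores[k], reverse=True)


if __name__ == "__main__":
    ligand = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    protein = [(0.0, 0.0, 3.0), (10.0, 0.0, 0.0)]
    print(contact_edges(ligand, protein))
    print(rerank([0.2, 0.9, 0.5]))
```

Each edge in the returned list connects one ligand atom to one protein atom, which is what makes the graph bipartite; per-atom descriptors would be attached as node features before the graph is passed to an attention layer.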
