Longest Common Subsequence Aplicada à Comparação de Proteínas
Resumo
Este trabalho realiza uma comparação do desempenho de dois algoritmos na busca da maior subsequência comum (LCS - Longest Common Sub- sequence) entre sequências de proteínas de organismos vivos. Para isso foram utilizadas sequências de diversos tamanhos e realizados testes com os algoritmos de Wagner-Fischer e Needleman-Wunsch. Foi observado que conforme o tamanho da entrada cresce, o tempo de execução aumenta linearmente para ambos os algoritmos, contudo, apresentando melhores resultados para o algoritmo de Wagner-Fischer.
Referências
Garey, M. and Johnson, D. (1979). Computers and Intractability: A Guide to the Theory of NP-completeness. Mathematical Sciences Series. W. H. Freeman.
Howe, K. L., Achuthan, P., Allen, J., Allen, J., Alvarez-Jarreta, J., Amode, M. R., Armean, I. M., Azov, A. G., Bennett, R., Bhai, J., Billis, K., Boddu, S., Charkhchi, M., Cummins, C., Da Rin Fioretto, L., Davidson, C., Dodiya, K., El Houdaigui, B., Fatima, R., Gall, A., Garcia Giron, C., Grego, T., Guijarro-Clarke, C., Haggerty, L., Hemrom, A., Hourlier, T., Izuogu, O. G., Juettemann, T., Kaikala, V., Kay, M., Lavidas, I., Le, T., Lemos, D., Gonzalez Martinez, J., Marug´an, J. C., Maurel, T., McMahon, A. C., Mohanan, S., Moore, B., Muffato, M., Oheh, D. N., Paraschas, D., Parker, A., Parton, A., Prosovetskaia, I., Sakthivel, M. P., Salam, A., Schmitt, B. M., Schuilenburg, H., Sheppard, D., Steed, E., Szpak, M., Szuba, M., Taylor, K., Thormann, A., Threadgold, G.,Walts, B.,Winterbottom, A., Chakiachvili, M., Chaubal, A., De Silva, N., Flint, B., Frankish, A., Hunt, S. E., IIsley, G. R., Langridge, N., Loveland, J. E., Martin, F. J., Mudge, J. M., Morales, J., Perry, E., Ruffier, M., Tate, J., Thybert, D., Trevanion, S. J., Cunningham, F., Yates, A. D., Zerbino, D. R., and Flicek, P. (2020). Ensembl 2021. Nucleic Acids Research, 49(D1):D884–D891.
Issa, M. and Elaziz, M. A. (2020). Analyzing covid-19 virus based on enhanced fragmented biological local aligner using improved ions motion optimization algorithm. Applied Soft Computing, 96:106683.
Kawade, G., Sahu, S., Upadhye, S., Korde, N., and Motghare, M. (2018). An analysis on computation of longest common subsequence algorithm.
Koonin, E. (2005). Orthologs, paralogs, and evolutionary genomics 1. Annual review of genetics, 39:309–38.
Marçais, G., Delcher, A. L., Phillippy, A. M., Coston, R., Salzberg, S. L., and Zimin, A. (2018). Mummer4: A fast and versatile genome alignment system. PLOS Computational Biology, 14:1–14.
Needleman, S. B. and Wunsch, C. D. (1970). A general method applicable to the search for similarities in the amino acid sequence of two proteins. Journal of Molecular Biology, 48:443–453.
Reddy, B. (2020). Bioinformatics & Pairwise Sequence Alignment: Local and Global Sequence Alignment Algorithms. Independently published.
Shafiq, M., Polo, J., Dickov, B., and Hussain, T. (2016). Modeling and performance evaluation of smith-waterman algorithm. In 2016 13th International Bhurban Conference on Applied Sciences and Technology (IBCAST), pages 191–198.
Shikder, R., Thulasiraman, P., Irani, P., and Hu, P. (2019). An openmp-based tool for finding longest common subsequence in bioinformatics. BMC Research Notes, 12.
Wagner, R. A. and Fischer, M. J. (1974). The string-to-string correction problem. J. ACM, 21(1):168–173.
Wei, S., Wang, Y., Yang, Y., and Liu, S. (2020). A path recorder algorithm for multiple longest common subsequences (mlcs) problems. Bioinformatics (Oxford, England), 36.
Zhang, J., Misra, S., Wang, H., and Feng, W.-c. (2017). Eliminating irregularities of protein sequence search on multicore architectures. In 2017 IEEE International Parallel and Distributed Processing Symposium (IPDPS), pages 62–71.