Heuristics based on Adjacency Graph Packing for DCJ Distance Considering Intergenic Regions

  • Gabriel Siqueira Unicamp
  • Alexsandro Oliveira Alexandrino Unicamp
  • Andre Rodrigues Oliveira UPM
  • Zanoni Dias Unicamp

Resumo


In this work, we explore heuristics for the Adjacency Graph Packing problem, which can be applied to the Double Cut and Join (DCJ) Distance Problem. The DCJ is a rearrangement operation and the distance problem considering it is a well established method for genome comparison. Our heuristics will use the structure called adjacency graph adapted to include information about intergenic regions, multiple copies of genes in the genomes, and multiple circular or linear chromosomes. The only required property from the genomes is that it must be possible to turn one into the other with DCJ operations. We propose one greedy heuristic and one heuristic based on Genetic Algorithms. Our experimental tests in artificial genomes show that the use of heuristics is capable of finding good results that are superior to a simpler random strategy.

Referências

Alexandrino, A. O., Brito, K. L., Oliveira, A. R., Dias, U., and Dias, Z. (2021a). Reversal Distance on Genomes with Different Gene Content and Intergenic Regions Information. In Proceedings of the 8th International Conference on Algorithms for Computational Biology (AlCoB’2021), volume 12715, pages 121–133. Springer International Publishing.

Alexandrino, A. O., Oliveira, A. R., Dias, U., and Dias, Z. (2021b). Incorporating Intergenic Regions into Reversal and Transposition Distances with Indels. Journal of Bioinformatics and Computational Biology, 19(06):2140011.

Bergeron, A., Mixtacki, J., and Stoye, J. (2006). A Unifying View of Genome Rearrangements. In International Workshop on Algorithms in Bioinformatics, pages 163–173. Springer.

Brito, K. L., Jean, G., Fertin, G., Oliveira, A. R., Dias, U., and Dias, Z. (2020). Sorting by Genome Rearrangements on both Gene Order and Intergenic Sizes. Journal of Computational Biology, 27(2):156–174.

Fertin, G., Jean, G., and Tannier, E. (2017). Algorithms for Computing the Double Cut and Join Distance on both Gene Order and Intergenic Sizes. Algorithms for Molecular Biology, 12(1):16.

Fertin, G., Labarre, A., Rusu, I., Tannier, É., and Vialette, S. (2009). Combinatorics of Genome Rearrangements. Computational Molecular Biology. The MIT Press, London, England.

Mitchell, M. (2008). Introduction to Genetic Algorithms. Springer Berlin Heidelberg, Cambridge, MA, USA.

Oliveira, A. R., Brito, K. L., Alexandrino, A. O., Siqueira, G., Dias, U., and Dias, Z. (2024). Rearrangement distance problems: An updated survey. ACM Computing Surveys, 56(8).

Oliveira, A. R., Jean, G., Fertin, G., Brito, K. L., Bulteau, L., Dias, U., and Dias, Z. (2021). Sorting Signed Permutations by Intergenic Reversals. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 18(6):2870–2876.

Rubert, D. P., Feijão, P., Braga, M. D. V., Stoye, J., and Martinez, F. H. V. (2017). Approximating the DCJ Distance of Balanced Genomes in Linear Time. Algorithms for Molecular Biology, 12(1):1–13.

Shao, M., Lin, Y., and Moret, B. M. (2015). An exact algorithm to compute the double-cut-and-join distance for genomes with duplicate genes. Journal of Computational Biology, 22(5):425–435.

Siqueira, G., Alexandrino, A. O., and Dias, Z. (2022). Signed Rearrangement Distances Considering Repeated Genes and Intergenic Regions. Proceedings of 14th International Conference on Bioinformatics and Computational Biology (BICoB’2022), 83:31–42.

Siqueira, G., Alexandrino, A. O., Oliveira, A. R., and Dias, Z. (2021a). Approximation Algorithm for Rearrangement Distances Considering Repeated Genes and Intergenic Regions. Algorithms for Molecular Biology, 16(1):1–23.

Siqueira, G., Oliveira, A. R., Alexandrino, A. O., and Dias, Z. (2021b). Heuristics for Cycle Packing of Adjacency Graphs for Genomes with Repeated Genes. In Proceedings of the 14th Brazilian Symposium on Bioinformatics (BSB’2021), pages 93–105. Springer International Publishing.

Siqueira, G., Oliveira, A. R., Alexandrino, A. O., Jean, G., Fertin, G., and Dias, Z. (2024). Assignment of orthologous genes in unbalanced genomes using cycle packing of adjacency graphs. Journal of Heuristics.

Yancopoulos, S., Attie, O., and Friedberg, R. (2005). Efficient Sorting of Genomic Permutations by Translocation, Inversion and Block Interchange. Bioinformatics, 21(16):3340–3346.
Publicado
02/12/2024
SIQUEIRA, Gabriel; ALEXANDRINO, Alexsandro Oliveira; OLIVEIRA, Andre Rodrigues; DIAS, Zanoni. Heuristics based on Adjacency Graph Packing for DCJ Distance Considering Intergenic Regions. In: SIMPÓSIO BRASILEIRO DE BIOINFORMÁTICA (BSB), 17. , 2024, Vitória/ES. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2024 . p. 71-82. ISSN 2316-1248. DOI: https://doi.org/10.5753/bsb.2024.245554.