Phylogeny Trees as a Tool to Compare Inference Algorithms of Orthologs

Abstract

Orthologous genes are defined as genes that arise from speciation events and are highly conserved in form and function. Several algorithms seek to identify them, but no simple methodology is available to assess the quality of their results. This work proposes a methodology to compare these algorithms, based on the definition of orthologs and the analysis of phylogenetic trees. Thirty prokaryotic proteomes were obtained, focusing on the genera Leifsonia and Clavibacter. Orthogroups were inferred using five graph-based algorithms (OMA, OrthoFinder, PorthoMCL, Proteinortho, and SonicParanoid). The frequency of each homologous group was obtained from the resulting raw data. The sequences were aligned with MUSCLE, trimmed with trimAl, and concatenated into supermatrices. The percentage of information in each supermatrix was calculated. Phylogenetic trees were built using three tree reconstruction methods: Maximum Likelihood, Bayesian inference, and Neighbor-Joining. The reference trees were built from 16S ribosomal RNA sequences. Furthermore, gene trees were inferred by Maximum Likelihood from orthogroups with 30 taxa. The trees were compared to the reference tree by topology and by Robinson-Foulds distance. Despite the differences in the number of orthogroups obtained from each algorithm, no significant differences were observed between the constructed trees. However, previous work with other, distinct species verified that this methodology may be viable. It is concluded that the proposed methodology is valid, although not for all species groups. Because the results depend on the input data, it is recommended that the methodology be performed for each new data set.
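
To illustrate the tree comparison step, the following is a minimal sketch of a Robinson-Foulds distance between two unrooted tree topologies over the same taxa. It is not the authors' pipeline: the nested-tuple tree representation, the function names (leaf_set, bipartitions, robinson_foulds), and the five-taxon example trees are all assumptions made for illustration. The distance is taken here as the number of non-trivial bipartitions present in one tree but not in the other.

```python
from itertools import chain


def leaf_set(tree):
    """Return the set of leaf names in a nested-tuple tree (hypothetical format)."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset(chain.from_iterable(leaf_set(child) for child in tree))


def bipartitions(tree):
    """Collect the non-trivial bipartitions (splits) induced by internal branches."""
    all_leaves = leaf_set(tree)
    splits = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        clade = frozenset().union(*(visit(child) for child in node))
        other = all_leaves - clade
        # Skip trivial splits: those separating fewer than two taxa on a side.
        if len(clade) > 1 and len(other) > 1:
            # An unordered pair, so the split does not depend on the rooting.
            splits.add(frozenset([clade, other]))
        return clade

    visit(tree)
    return splits


def robinson_foulds(tree_a, tree_b):
    """Count the splits found in exactly one of the two trees."""
    if leaf_set(tree_a) != leaf_set(tree_b):
        raise ValueError("both trees must contain the same taxa")
    return len(bipartitions(tree_a) ^ bipartitions(tree_b))


# Illustrative topologies only; the taxon labels are placeholders.
reference = ((("A", "B"), "C"), ("D", "E"))
gene_tree = ((("A", "C"), "B"), ("D", "E"))
rerooted_reference = ("A", ("B", ("C", ("D", "E"))))

print(robinson_foulds(reference, gene_tree))           # 2
print(robinson_foulds(reference, rerooted_reference))  # 0
```

A distance of 0 means the two trees share every internal split, so a gene tree or supermatrix tree that matches the 16S reference topology would score 0 under this sketch regardless of where either tree is rooted.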
Published
2022-09-21
How to Cite
OLIVEIRA, Rafael; LEITE, Saul de Castro; ALMEIDA, Fernanda Nascimento. Phylogeny Trees as a Tool to Compare Inference Algorithms of Orthologs. Proceedings of the Brazilian Symposium on Bioinformatics (BSB), [S.l.], p. 128-139, sep. 2022. ISSN 2316-1248. Available at: <https://sol.sbc.org.br/index.php/bsb/article/view/22863>. Date accessed: 17 may 2024.