Integrated genome and gene diversity analyses of bacterial genera in metagenomic data
Abstract
We present an analysis of the subspecies diversity of the bacterial genera Xanthomonas, Acinetobacter, and Stenotrophomonas in the MetaSUB metagenomic dataset. We used simulated rarefaction curves together with the nonparametric estimators Chao1, Chao2, Jackknife, ICE, and ACE to estimate the total number of subspecies (subgroups) within the MetaSUB dataset. A subgroup is an operational concept used to distinguish between genomes of the same species. For Xanthomonas, our results suggest that MetaSUB may contain more subgroups than are currently known; for Acinetobacter, the estimated number of subgroups is lower than the known number; and for Stenotrophomonas, the estimates match the known number. A comparison of the Xanthomonas pangenome built from our genome database with the one built from MetaSUB showed that gene diversity in MetaSUB is much lower than in the database, suggesting that urban environments constrain gene diversity relative to the biosphere as a whole.
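As a rough illustration of the kind of estimate described in the abstract, the sketch below computes a bias-corrected Chao1 value and a simulated rarefaction curve from per-subgroup abundance counts. It is not the authors' pipeline: the function names, the choice of the bias-corrected Chao1 form, and the example counts are all assumptions made for this example.

```python
import random


def chao1(counts):
    """Bias-corrected Chao1: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1)).

    counts: number of reads (or genomes) assigned to each subgroup.
    F1 and F2 are the numbers of subgroups seen exactly once and twice.
    """
    observed = [c for c in counts if c > 0]
    s_obs = len(observed)
    f1 = sum(1 for c in observed if c == 1)
    f2 = sum(1 for c in observed if c == 2)
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))


def rarefaction_curve(counts, depths, replicates=100, seed=0):
    """Mean number of distinct subgroups seen when subsampling without replacement."""
    rng = random.Random(seed)
    # One entry per individual observation, labelled by its subgroup index.
    pool = [label for label, c in enumerate(counts) for _ in range(c)]
    curve = []
    for depth in depths:
        total = 0
        for _ in range(replicates):
            total += len(set(rng.sample(pool, depth)))
        curve.append(total / replicates)
    return curve


# Hypothetical abundance counts for seven subgroups (not data from the paper).
counts = [10, 5, 2, 2, 1, 1, 1]
estimate = chao1(counts)
curve = rarefaction_curve(counts, depths=[1, 11, 22])
print(estimate, curve)
```

A rarefaction curve that is still rising at full sampling depth, or a Chao1 value well above the observed count, would both point to subgroups that have not yet been sampled.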
Keywords:
microbiome, metagenome, taxonomic classification, subspecies, Xanthomonas, Acinetobacter, Stenotrophomonas, MetaSUB
References
An, S.-Q., Potnis, N., Dow, M., Vorhölter, F.-J., He, Y.-Q., Becker, A., Teper, D., Li, Y., Wang, N., Bleris, L., & Tang, J.-L. (2019). Mechanistic insights into host adaptation, virulence and epidemiology of the phytopathogen Xanthomonas. FEMS Microbiology Reviews, 44(1), 1–32.
Danko, D., Bezdan, D., Afshin, E. E., Ahsanuddin, S., Bhattacharya, C., Butler, D. J., Chng, K. R., Donnellan, D., Hecht, J., Jackson, K., Kuchin, K., Karasikov, M., Lyons, A., Mak, L., Meleshko, D., Mustafa, H., Mutai, B., Neches, R. Y., Ng, A., & Nikolayeva, O. (2021). A global metagenomic map of urban microbiomes and antimicrobial resistance. Cell, 0(0).
Elena Schmitz, J., & Rahmann, S. (2025). A comprehensive review and evaluation of species richness estimation. Briefings in Bioinformatics, 26(2).
Gotelli, N. J., & Colwell, R. K. (2011). Estimating species richness. In A. E. Magurran & B. J. McGill (Eds.), Biological Diversity: Frontiers in Measurement and Assessment (pp. 39–54). New York: Oxford University Press.
Magurran, A. E. (2007). Species abundance distributions over time. Ecology Letters, 10(5), 347–354.
Rinke, C., Schwientek, P., Sczyrba, A., Ivanova, N. N., Anderson, I. J., Cheng, J.-F., Darling, A., Malfatti, S., Swan, B. K., Gies, E. A., Dodsworth, J. A., Hedlund, B. P., Tsiamis, G., Sievert, S. M., Liu, W.-T., Eisen, J. A., Hallam, S. J., Kyrpides, N. C., Stepanauskas, R., & Rubin, E. M. (2013). Insights into the phylogeny and coding potential of microbial dark matter. Nature, 499(7459), 431–437.
Roumpeka, D. D., Wallace, R. J., Escalettes, F., Fotheringham, I., & Watson, M. (2017). A Review of Bioinformatics Tools for Bio-Prospecting from Metagenomic Sequence Data. Frontiers in Genetics, 8.
Seemann, T. (2014). Prokka: rapid prokaryotic genome annotation. Bioinformatics, 30(14), 2068–2069.
Solano, A., & Setubal, J. (2024). A computational pipeline for species- and strain-level classification of metagenomic sequences. In Proceedings of the 17th Brazilian Symposium on Bioinformatics, pp. 155-166. Porto Alegre: SBC.
Steen, A. D., Crits-Christoph, A., Carini, P., DeAngelis, K. M., Fierer, N., Lloyd, K. G., & Cameron Thrash, J. (2019). High proportions of bacteria and archaea across most biomes remain uncultured. The ISME Journal, 13(12), 3126–3130.
Sunagawa, S., Acinas, S. G., Bork, P., Bowler, C., Eveillard, D., Gorsky, G., Guidi, L., Iudicone, D., Karsenti, E., Lombard, F., Ogata, H., Pesant, S., Sullivan, M. B., Wincker, P., & de Vargas, C. (2020). Tara Oceans: towards global ocean ecosystems biology. Nature Reviews Microbiology, 18.
Tettelin, H., Riley, D., Cattuto, C., & Medini, D. (2008). Comparative genomics: the bacterial pan-genome. Current Opinion in Microbiology, 11(5), 472–477.
Tonkin-Hill, G., MacAlasdair, N., Ruis, C., Weimann, A., Horesh, G., Lees, J. A., Gladstone, R. A., Lo, S., Beaudoin, C., Floto, R. A., Frost, S. D. W., Corander, J., Bentley, S. D., & Parkhill, J. (2020). Producing polished prokaryotic pangenomes with the Panaroo pipeline. Genome Biology, 21(1).
Published
29/09/2025
How to Cite
GABAS, Mariana Louise; BARRIOS SOLANO, Arthur Henrique; SETUBAL, João Carlos. Integrated genome and gene diversity analyses of bacterial genera in metagenomic data. In: SIMPÓSIO BRASILEIRO DE BIOINFORMÁTICA (BSB), 18., 2025, Fortaleza/CE. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 228-233. ISSN 2316-1248. DOI: https://doi.org/10.5753/bsb.2025.14573.
