ContFree-NGS: Removing Reads from Contaminating Organisms in Next Generation Sequencing Data

Abstract


We present ContFree-NGS, open-source software that removes reads originating from contaminant organisms from a sequencing dataset. The user provides a target taxon, and any read that is not assigned to this taxon or one of its descendants is labelled as a contaminant. To do this, ContFree-NGS relies on the output of a taxonomic classification engine such as Kraken2 or Kaiju.
Keywords: NGS, Contamination, Bioinformatics
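
The filtering rule described in the abstract can be illustrated with a minimal sketch. This is not the ContFree-NGS implementation; it only assumes that the classifier output is tab-separated with the classification status (C or U), the read identifier, and the assigned taxon ID in the first three columns, as in the standard Kraken2 and Kaiju outputs. The names `is_in_clade`, `partition_reads`, and `TOY_PARENTS` are hypothetical, the taxon IDs are made up, and reporting unclassified reads separately is an assumption of this sketch rather than something stated in the abstract.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical toy taxonomy (child taxon ID -> parent taxon ID).
# The IDs are invented for illustration; a real run would load the
# parent links from a taxonomy database such as the NCBI Taxonomy.
TOY_PARENTS: Dict[int, int] = {1: 1, 10: 1, 11: 10, 12: 11, 20: 1, 21: 20}


def is_in_clade(taxid: int, target: int, parents: Dict[int, int]) -> bool:
    """Return True if taxid is the target taxon or one of its descendants."""
    seen = set()
    # Walk up the lineage until the target, an unknown taxon, or a loop
    # (the root is its own parent) is reached.
    while taxid not in seen:
        if taxid == target:
            return True
        seen.add(taxid)
        parent = parents.get(taxid)
        if parent is None:
            return False
        taxid = parent
    return False


def partition_reads(
    lines: Iterable[str], target: int, parents: Dict[int, int]
) -> Tuple[List[str], List[str], List[str]]:
    """Split read IDs into (target, contaminant, unclassified) lists."""
    kept: List[str] = []
    contaminant: List[str] = []
    unclassified: List[str] = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip malformed or empty lines
        status, read_id, taxid = fields[0], fields[1], fields[2]
        if status == "U":
            unclassified.append(read_id)
        elif is_in_clade(int(taxid), target, parents):
            kept.append(read_id)
        else:
            contaminant.append(read_id)
    return kept, contaminant, unclassified


# Made-up classifier output: status, read ID, assigned taxon ID.
EXAMPLE = ["C\tread1\t12", "C\tread2\t21", "U\tread3\t0", "C\tread4\t10"]

if __name__ == "__main__":
    kept, contaminant, unclassified = partition_reads(EXAMPLE, 10, TOY_PARENTS)
    print("target:", kept)
    print("contaminant:", contaminant)
    print("unclassified:", unclassified)
```

With the toy data and target taxon 10, read1 and read4 fall inside the target clade, read2 is labelled as a contaminant, and read3 is unclassified. The resulting read ID lists could then be used to split the original FASTQ files.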

References

Park, S.J., Onizuka, S., Seki, M., et al.: A systematic sequencing-based approach for microbial contaminant detection and functional inference. BMC Biol. 17, 72 (2019). https://doi.org/10.1186/s12915-019-0690-0

Goig, G.A., Blanco, S., Garcia-Basteiro, A.L., et al.: Contaminant DNA in bacterial sequencing experiments is a major source of false genetic variability. BMC Biol. 18, 24 (2020). https://doi.org/10.1186/s12915-020-0748-z

National Center for Biotechnology Information: Contamination in Sequence Databases (2016). https://www.ncbi.nlm.nih.gov/tools/vecscreen/contam/. Accessed 6 Oct 2021

Sangiovanni, M., Granata, I., Thind, A., et al.: From trash to treasure: detecting unexpected contamination in unmapped NGS data. BMC Bioinform. 20, 168 (2019). https://doi.org/10.1186/s12859-019-2684-x

Steinegger, M., Salzberg, S.L.: Terminating contamination: large-scale search identifies more than 2,000,000 contaminated entries in GenBank. Genome Biol. 21, 115 (2020). https://doi.org/10.1186/s13059-020-02023-1

Xi, W., Gao, Y., Cheng, Z., et al.: Using QC-blind for quality control and contamination screening of bacteria DNA sequencing data without reference genome. Front. Microbiol. 10, 1560 (2019). https://doi.org/10.3389/fmicb.2019.01560

Wood, D.E., Lu, J., Langmead, B.: Improved metagenomic analysis with Kraken 2. Genome Biol. 20, 257 (2019). https://doi.org/10.1186/s13059-019-1891-0

Menzel, P., Ng, K., Krogh, A.: Fast and sensitive taxonomic classification for metagenomics with Kaiju. Nat. Commun. 7, 11257 (2016). https://doi.org/10.1038/ncomms11257
Published
22 Nov 2021
PERES, Felipe Vaz; RIAÑO-PACHÓN, Diego Mauricio. ContFree-NGS: Removing Reads from Contaminating Organisms in Next Generation Sequencing Data. In: SIMPÓSIO BRASILEIRO DE BIOINFORMÁTICA (BSB), 14., 2021, Online. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2021. p. 65-68. ISSN 2316-1248.