Efficient Out-of-Core Contig Generation

Abstract


Genome sequencing involves splitting a genome into a set of reads, which are assembled into contigs that are eventually ordered and organized into scaffolds. Many assemblers rely on the de Bruijn graph (dBG), but they incur a high computational cost, mainly due to RAM consumption. We propose an external-memory approach to de Bruijn graph construction, focusing on contig generation. Our algorithms build on well-known I/O-efficient methods to identify unitigs and remove errors such as tips and bubbles. Our analytical evaluation shows that generating de Bruijn graphs and obtaining the required contigs becomes feasible independently of the available memory.
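
To make the terms above concrete, here is a minimal in-memory sketch of building a de Bruijn graph from reads and extracting unitigs (maximal non-branching paths). It is not the out-of-core method of the paper: the function names `build_dbg` and `unitigs` are hypothetical, the graph is held entirely in RAM, and reverse complements, tips, and bubbles are all ignored.

```python
from collections import defaultdict


def build_dbg(reads, k):
    """Build an edge-centric de Bruijn graph (illustrative, in-memory only).

    Nodes are (k-1)-mers; every distinct k-mer in the reads contributes
    one edge from its prefix to its suffix.
    """
    succ = defaultdict(set)  # node -> set of successor nodes
    pred = defaultdict(set)  # node -> set of predecessor nodes
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            u, v = kmer[:-1], kmer[1:]
            succ[u].add(v)
            pred[v].add(u)
    return succ, pred


def unitigs(succ, pred):
    """Return the maximal non-branching paths, spelled as strings.

    Simplification: isolated cycles in which every node has exactly one
    predecessor and one successor are not reported.
    """
    nodes = set(succ) | set(pred)

    def is_inner(n):
        # An inner node has exactly one incoming and one outgoing edge.
        return len(pred.get(n, ())) == 1 and len(succ.get(n, ())) == 1

    result = []
    for u in nodes:
        if is_inner(u):
            continue  # paths start only at branching or end nodes
        for v in succ.get(u, ()):
            path = u + v[-1]
            while is_inner(v):
                v = next(iter(succ[v]))
                path += v[-1]
            result.append(path)
    return sorted(result)


if __name__ == "__main__":
    succ, pred = build_dbg(["ATGGCGT", "GGCGTGC", "GGCGTTA"], k=4)
    for u in unitigs(succ, pred):
        print(u)
```

With the three toy reads above, the node `CGT` has two successors, so the graph splits into one unitig leading up to the branch and one for each alternative after it. An external-memory variant would have to replace the dictionaries with sorted on-disk structures, which this sketch does not attempt.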
Published
23/11/2020
How to Cite

ENTENZA, Julio Omar Prieto; HAEUSLER, Edward Hermann; LIFSCHITZ, Sérgio. Efficient Out-of-Core Contig Generation. In: SIMPÓSIO BRASILEIRO DE BIOINFORMÁTICA (BSB), 13., 2020, Evento Online. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2020. p. 25-37. ISSN 2316-1248.