Efficient Out-of-Core Contig Generation
Abstract
Genome sequencing involves splitting a genome into a set of reads, which are assembled into contigs that are eventually ordered and organized into scaffolds. Many programs use the de Bruijn graph (dBG) for this task, but they incur a high computational cost, mainly due to RAM consumption. We propose an external-memory approach to de Bruijn graph construction, focusing on contig generation. Our proposed algorithms are based on well-known I/O-efficient methods that identify unitigs and remove errors such as tips and bubbles. Our analytical evaluation shows that it becomes feasible to generate de Bruijn graphs and obtain the needed contigs independently of the available memory.
Published
2020-11-23
How to Cite
ENTENZA, Julio Omar Prieto; HAEUSLER, Edward Hermann; LIFSCHITZ, Sérgio. Efficient Out-of-Core Contig Generation. In: BRAZILIAN SYMPOSIUM ON BIOINFORMATICS (BSB), 13., 2020, Evento Online. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2020. p. 25-37. ISSN 2316-1248.
