Optimization of internal structures memory space of k-mers repetition's frequency counter

  • Matheus P. Ferreira (Mackenzie)
  • Fabio T. Ishikawa (Mackenzie)
  • Fabricio G. Vilasbôas (Mackenzie)
  • Calebe P. Bianchini (Mackenzie)

Abstract


In this article, we present CFRK-MC, a k-mer repetition frequency counter optimized for shared-memory environments and based on the original CFRK. CFRK-MC reduced execution time 11.5-fold compared with the original version while consistently reducing memory usage.
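
For readers unfamiliar with the task, the following is a minimal sketch of what counting k-mer repetition frequency means: every substring of length k in a sequence is tallied. It is not the CFRK-MC implementation, whose internal structures are not described in this abstract; the function name `count_kmers` and the dictionary-based counting are assumptions made purely for illustration.

```python
from collections import Counter


def count_kmers(sequence: str, k: int) -> Counter:
    """Count how often each substring of length k occurs in `sequence`.

    Illustrative only: a hypothetical helper, not part of CFRK or CFRK-MC.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    counts = Counter()
    # Slide a window of width k over the sequence, one base at a time.
    for i in range(len(sequence) - k + 1):
        counts[sequence[i:i + k]] += 1
    return counts


# A sequence of length n yields n - k + 1 k-mers (here 8 - 2 + 1 = 7).
example = count_kmers("ACGTACGT", 2)
```

A dictionary keeps the sketch short, but its memory footprint grows with the number of distinct k-mers, which is why the layout of the counting structure matters for a tool of this kind.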

Keywords: High-Performance Applications, Performance measurements, evaluation and prediction, Operating Systems

Published
2021-05-06
FERREIRA, Matheus P.; ISHIKAWA, Fabio T.; VILASBÔAS, Fabricio G.; BIANCHINI, Calebe P. Optimization of internal structures memory space of k-mers repetition's frequency counter. In: REGIONAL SCHOOL OF HIGH PERFORMANCE COMPUTING FROM SÃO PAULO (ERAD-SP), 12., 2021, Evento Online. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2021. p. 17-20. DOI: https://doi.org/10.5753/eradsp.2021.16695.