PIMBA: A PIpeline for MetaBarcoding Analysis

  • Conference paper
Advances in Bioinformatics and Computational Biology (BSB 2021)

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 13063)

Abstract

DNA metabarcoding is an emerging monitoring method that assesses biodiversity from environmental DNA (eDNA) samples. The growing volume of Next-Generation Sequencing data has required advances in computational tools. Tools for DNA metabarcoding analysis, such as MOTHUR, QIIME, Obitools, PEMA, and mBRAVE, have been widely used in ecological studies; however, users encounter difficulties when they need to use custom databases. Here we present PIMBA, a PIpeline for MetaBarcoding Analysis, which supports customized databases as well as the reference databases used by the tools mentioned above. PIMBA is an open-source, user-friendly pipeline that consolidates all analyses into just three command lines. PIMBA’s implementation is available at https://github.com/reinator/pimba.

Author information

Corresponding author

Correspondence to Guilherme Oliveira.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Oliveira, R.R.M., Silva, R., Nunes, G.L., Oliveira, G. (2021). PIMBA: A PIpeline for MetaBarcoding Analysis. In: Stadler, P.F., Walter, M.E.M.T., Hernandez-Rosales, M., Brigido, M.M. (eds) Advances in Bioinformatics and Computational Biology. BSB 2021. Lecture Notes in Computer Science, vol 13063. Springer, Cham. https://doi.org/10.1007/978-3-030-91814-9_10

  • DOI: https://doi.org/10.1007/978-3-030-91814-9_10

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-91813-2

  • Online ISBN: 978-3-030-91814-9

  • eBook Packages: Computer Science, Computer Science (R0)
