davidescarpetta / taxonomic_profiling

Metabarcoding analysis pipeline, using DADA2 and qiime2

bioinformatics bioinformatics-pipeline cutadapt dada2 metabarcoding metabarcoding-data qiime2 taxonomic-profiling


Taxonomic Profiling

Metabarcoding is the barcoding of DNA/RNA (or eDNA/eRNA) in a manner that allows for the simultaneous identification of many taxa within the same sample. The main difference between barcoding and metabarcoding is that metabarcoding does not focus on one specific organism, but instead aims to determine species composition within a sample.

Here I present a bioinformatics metabarcoding analysis pipeline that starts from raw paired-end (PE) fastq data and uses DADA2 and qiime2. Run the scripts only after a quality control check; I recommend fastqc and multiqc.
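A minimal sketch of the quality control check, assuming all raw reads sit in one directory as `*.fastq.gz`. The function name `run_qc` and the `THREADS` variable are hypothetical and not part of the repository scripts.

```shell
# Hypothetical helper: run FastQC on every FASTQ file, then summarise with MultiQC.
# Arguments: $1 = directory with raw *.fastq.gz files, $2 = output directory.
run_qc() {
  local raw_dir=$1 qc_dir=$2
  mkdir -p "$qc_dir"
  # FastQC writes one HTML/zip report per input file into $qc_dir.
  fastqc --outdir "$qc_dir" --threads "${THREADS:-4}" "$raw_dir"/*.fastq.gz
  # MultiQC aggregates the FastQC reports found in $qc_dir into a single report.
  multiqc "$qc_dir" --outdir "$qc_dir"
}
```

Example call: `run_qc raw_reads qc_reports`.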

Be sure to replace the variables in each script with your own values.

This analysis was run on a Slurm HPC.

Use of conda and singularity is simply for convenience.

QIIME2 singularity image:

  singularity pull docker://quay.io/qiime2/amplicon:2023.9

QIIME2 conda installation:

  wget https://raw.githubusercontent.com/qiime2/distributions/dev/latest/passed/qiime2-amplicon-ubuntu-latest-conda.yml
  conda env create -n qiime2-dev --file qiime2-amplicon-ubuntu-latest-conda.yml

cutadapt

  conda install -c bioconda cutadapt

biom-format

  conda install -c bioconda biom-format

SILVA DB

Silva 138 SSURef NR99 full-length sequences and taxonomy to train the classifier are available here:

https://docs.qiime2.org/2023.9/data-resources/

Docs:

Script order:

- 1) Primer removal

A bash script that removes primers using cutadapt
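As a rough sketch of this step (not the repository script itself), primers can be trimmed from a read pair as below. The primer sequences are placeholders (515F/806R); the function name `remove_primers` and the `THREADS` variable are hypothetical.

```shell
# Placeholder primers (515F / 806R); replace them with the primers of your assay.
FWD_PRIMER="${FWD_PRIMER:-GTGYCAGCMGCCGCGGTAA}"
REV_PRIMER="${REV_PRIMER:-GGACTACNVGGGTWTCTAAT}"

# Trim the forward primer from the 5' end of R1 (-g) and the reverse primer
# from the 5' end of R2 (-G); pairs in which a primer is not found are dropped.
# Arguments: $1 = R1 input, $2 = R2 input, $3 = output directory.
remove_primers() {
  local r1=$1 r2=$2 out_dir=$3
  mkdir -p "$out_dir"
  cutadapt \
    -g "$FWD_PRIMER" \
    -G "$REV_PRIMER" \
    --discard-untrimmed \
    --cores "${THREADS:-4}" \
    -o "$out_dir/$(basename "$r1")" \
    -p "$out_dir/$(basename "$r2")" \
    "$r1" "$r2"
}
```

Example call: `remove_primers raw/S1_R1.fastq.gz raw/S1_R2.fastq.gz no_primers`.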

- 2) Adapter removal

A bash script that removes adapters using cutadapt
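A comparable sketch for adapter trimming. The adapter shown is the common Illumina prefix and is used here only as a placeholder; the minimum length of 50 and the name `remove_adapters` are assumptions.

```shell
# Placeholder adapter (common Illumina prefix); use the one from your library prep.
ADAPTER="${ADAPTER:-AGATCGGAAGAGC}"

# Trim 3' adapters from R1 (-a) and R2 (-A) and drop pairs that become too short.
# Arguments: $1 = R1 input, $2 = R2 input, $3 = output directory.
remove_adapters() {
  local r1=$1 r2=$2 out_dir=$3
  mkdir -p "$out_dir"
  cutadapt \
    -a "$ADAPTER" \
    -A "$ADAPTER" \
    --minimum-length "${MIN_LEN:-50}" \
    -o "$out_dir/$(basename "$r1")" \
    -p "$out_dir/$(basename "$r2")" \
    "$r1" "$r2"
}
```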

- 3) Qiime import

A bash script that imports the fastq files into a qiime artifact (.qza file), which makes working with the reads easier and faster
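One way to sketch the import is through a manifest file. The file naming (`<sample>_R1.fastq.gz` / `<sample>_R2.fastq.gz`) and the function names are assumptions, not taken from the repository.

```shell
# Write a paired-end manifest (tab-separated, V2 layout) for every *_R1.fastq.gz
# found in a directory. Pass an absolute path: the manifest needs absolute file paths.
# Arguments: $1 = directory with cleaned reads, $2 = manifest file to write.
make_manifest() {
  local reads_dir=$1 manifest=$2 r1 sample
  printf 'sample-id\tforward-absolute-filepath\treverse-absolute-filepath\n' > "$manifest"
  for r1 in "$reads_dir"/*_R1.fastq.gz; do
    sample=$(basename "$r1" _R1.fastq.gz)
    printf '%s\t%s\t%s\n' "$sample" "$r1" "${r1%_R1.fastq.gz}_R2.fastq.gz" >> "$manifest"
  done
}

# Import the reads listed in the manifest into a single .qza artifact.
# Arguments: $1 = manifest file, $2 = output artifact.
import_reads() {
  local manifest=$1 out_qza=$2
  qiime tools import \
    --type 'SampleData[PairedEndSequencesWithQuality]' \
    --input-path "$manifest" \
    --input-format PairedEndFastqManifestPhred33V2 \
    --output-path "$out_qza"
}
```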

- 4) Denoise DADA2

A bash script that denoises the reads with DADA2. The output is a set of Amplicon Sequence Variants (ASVs), which the literature generally regards as preferable to OTUs
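A sketch of the denoising call. The truncation lengths are placeholders that should be chosen from the quality profiles of your own run; the function name and the `TRUNC_F`, `TRUNC_R` and `THREADS` variables are hypothetical.

```shell
# Denoise paired-end reads with DADA2 and write the ASV table, the representative
# sequences and the denoising statistics.
# Arguments: $1 = demultiplexed reads artifact, $2 = output directory.
denoise_dada2() {
  local demux=$1 out_dir=$2
  mkdir -p "$out_dir"
  qiime dada2 denoise-paired \
    --i-demultiplexed-seqs "$demux" \
    --p-trunc-len-f "${TRUNC_F:-250}" \
    --p-trunc-len-r "${TRUNC_R:-200}" \
    --p-n-threads "${THREADS:-4}" \
    --o-table "$out_dir/table.qza" \
    --o-representative-sequences "$out_dir/rep-seqs.qza" \
    --o-denoising-stats "$out_dir/denoising-stats.qza"
}
```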

- 5) Extract reads classifier

A bash script that extracts reference reads from the SILVA database using the PCR primers
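The extraction can be sketched as follows, assuming the SILVA sequences have already been downloaded as a .qza artifact and that `FWD_PRIMER` and `REV_PRIMER` hold your primer sequences. The function name is hypothetical.

```shell
# Cut the amplified region out of the full-length SILVA reference sequences.
# Requires FWD_PRIMER and REV_PRIMER to be set; aborts with a message otherwise.
# Arguments: $1 = SILVA reference sequences (.qza), $2 = output artifact.
extract_reference_reads() {
  local silva_seqs=$1 out_qza=$2
  qiime feature-classifier extract-reads \
    --i-sequences "$silva_seqs" \
    --p-f-primer "${FWD_PRIMER:?set FWD_PRIMER}" \
    --p-r-primer "${REV_PRIMER:?set REV_PRIMER}" \
    --o-reads "$out_qza"
}
```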

- 6) Training classifier

A bash script that trains a Naive Bayes classifier. The output is classifier.qza
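A minimal sketch of the training call, assuming the extracted reference reads and the SILVA taxonomy are available as .qza artifacts. The function name is hypothetical.

```shell
# Train a Naive Bayes classifier on the extracted reference reads.
# Arguments: $1 = reference reads, $2 = reference taxonomy,
#            $3 = output classifier (defaults to classifier.qza).
train_classifier() {
  local ref_reads=$1 ref_tax=$2 classifier=${3:-classifier.qza}
  qiime feature-classifier fit-classifier-naive-bayes \
    --i-reference-reads "$ref_reads" \
    --i-reference-taxonomy "$ref_tax" \
    --o-classifier "$classifier"
}
```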

- 7) Classifier

A bash script that applies the previously trained classifier to our data
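Classification of the representative sequences might look like the sketch below. The function name, the file names and the `THREADS` variable are assumptions.

```shell
# Assign taxonomy to the ASVs and produce a browsable table of the assignments.
# Arguments: $1 = trained classifier, $2 = representative sequences, $3 = output directory.
classify_asvs() {
  local classifier=$1 rep_seqs=$2 out_dir=$3
  mkdir -p "$out_dir"
  qiime feature-classifier classify-sklearn \
    --i-classifier "$classifier" \
    --i-reads "$rep_seqs" \
    --p-n-jobs "${THREADS:-4}" \
    --o-classification "$out_dir/taxonomy.qza"
  qiime metadata tabulate \
    --m-input-file "$out_dir/taxonomy.qza" \
    --o-visualization "$out_dir/taxonomy.qzv"
}
```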

- 8) Export Filter

A bash script that:

  • exports the taxa barplot

  • keeps only sequences classified at the phylum level

  • filters out chloroplast sequences

  • exports taxonomic counts at all levels

  • collapses groups of features that have the same taxonomic assignment through the specified level

  • converts the tables to .tsv
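The operations listed above can be sketched in a single function. The function name, the file names and the choice of levels 1 to 7 are assumptions, and the barplot is only generated here, not unpacked.

```shell
# Barplot, taxonomy-based filtering, per-level collapsing and TSV export.
# Arguments: $1 = feature table, $2 = taxonomy, $3 = sample metadata, $4 = output directory.
filter_and_export() {
  local table=$1 taxonomy=$2 metadata=$3 out_dir=$4 level
  mkdir -p "$out_dir"
  qiime taxa barplot \
    --i-table "$table" --i-taxonomy "$taxonomy" \
    --m-metadata-file "$metadata" \
    --o-visualization "$out_dir/taxa-barplot.qzv"
  # Keep features with a phylum-level annotation (p__) and drop chloroplasts.
  qiime taxa filter-table \
    --i-table "$table" --i-taxonomy "$taxonomy" \
    --p-include p__ --p-exclude chloroplast \
    --o-filtered-table "$out_dir/table-filtered.qza"
  for level in 1 2 3 4 5 6 7; do
    # Sum features that share the same assignment down to this level.
    qiime taxa collapse \
      --i-table "$out_dir/table-filtered.qza" --i-taxonomy "$taxonomy" \
      --p-level "$level" \
      --o-collapsed-table "$out_dir/table-L$level.qza"
    # Exporting a feature table writes feature-table.biom into the output path.
    qiime tools export \
      --input-path "$out_dir/table-L$level.qza" \
      --output-path "$out_dir/export-L$level"
    biom convert \
      -i "$out_dir/export-L$level/feature-table.biom" \
      -o "$out_dir/table-L$level.tsv" --to-tsv
  done
}
```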

- 9) Tree construction

A bash script that generates a tree for phylogenetic diversity analysis
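A sketch of tree construction using the MAFFT plus FastTree pipeline shipped with qiime2. Whether the repository script uses this exact pipeline is an assumption, as are the function and file names.

```shell
# Align the representative sequences, mask the alignment and build
# an unrooted and a rooted tree.
# Arguments: $1 = representative sequences, $2 = output directory.
build_tree() {
  local rep_seqs=$1 out_dir=$2
  mkdir -p "$out_dir"
  qiime phylogeny align-to-tree-mafft-fasttree \
    --i-sequences "$rep_seqs" \
    --o-alignment "$out_dir/aligned-rep-seqs.qza" \
    --o-masked-alignment "$out_dir/masked-aligned-rep-seqs.qza" \
    --o-tree "$out_dir/unrooted-tree.qza" \
    --o-rooted-tree "$out_dir/rooted-tree.qza"
}
```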

- 10) Core Metrics

A bash script that runs the core metrics method, which rarefies a feature table to a user-specified depth, computes the qiime2 default alpha and beta diversity metrics, and generates PCoA plots using Emperor for each of the beta diversity metrics
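A sketch of the core metrics call, assuming the phylogenetic variant is used together with the rooted tree from the previous step. The function name is hypothetical, and the sampling depth has to be chosen from your own table summary.

```shell
# Rarefy the table and compute the default alpha and beta diversity metrics.
# The output directory must not exist yet; qiime creates it.
# Arguments: $1 = feature table, $2 = rooted tree, $3 = sample metadata,
#            $4 = sampling depth, $5 = output directory.
core_metrics() {
  local table=$1 tree=$2 metadata=$3 depth=$4 out_dir=$5
  qiime diversity core-metrics-phylogenetic \
    --i-table "$table" \
    --i-phylogeny "$tree" \
    --m-metadata-file "$metadata" \
    --p-sampling-depth "$depth" \
    --output-dir "$out_dir"
}
```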

- 11) Alpha metrics

A bash script that calculates alpha diversity metrics that are not computed by default, such as Chao1, Simpson, and ACE.
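The extra metrics can be computed in a loop, as sketched below. Using the rarefied table written by the core metrics step as input is an assumption, and the function name is hypothetical.

```shell
# Compute Chao1, Simpson and ACE, one alpha diversity artifact per metric.
# Arguments: $1 = feature table (for example the rarefied table), $2 = output directory.
extra_alpha_metrics() {
  local table=$1 out_dir=$2 metric
  mkdir -p "$out_dir"
  for metric in chao1 simpson ace; do
    qiime diversity alpha \
      --i-table "$table" \
      --p-metric "$metric" \
      --o-alpha-diversity "$out_dir/$metric.qza"
  done
}
```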

- 12) Alpha

A bash script that runs the alpha group significance analysis
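A minimal sketch for one alpha diversity vector. The function name is hypothetical.

```shell
# Test whether an alpha diversity metric differs between metadata groups.
# Arguments: $1 = alpha diversity artifact, $2 = sample metadata, $3 = output visualization.
alpha_significance() {
  local alpha_vector=$1 metadata=$2 out_qzv=$3
  qiime diversity alpha-group-significance \
    --i-alpha-diversity "$alpha_vector" \
    --m-metadata-file "$metadata" \
    --o-visualization "$out_qzv"
}
```

Example call: `alpha_significance alpha/chao1.qza metadata.tsv alpha/chao1-significance.qzv`.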

- 13) Beta Diversity

A bash script that runs the beta group significance analysis
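A sketch for one distance matrix and one categorical metadata column. PERMANOVA with pairwise tests is an assumption about the settings, and the function name is hypothetical.

```shell
# Test whether community composition differs between the groups of a metadata column.
# Arguments: $1 = distance matrix, $2 = sample metadata,
#            $3 = categorical metadata column, $4 = output visualization.
beta_significance() {
  local distance_matrix=$1 metadata=$2 column=$3 out_qzv=$4
  qiime diversity beta-group-significance \
    --i-distance-matrix "$distance_matrix" \
    --m-metadata-file "$metadata" \
    --m-metadata-column "$column" \
    --p-method permanova \
    --p-pairwise \
    --o-visualization "$out_qzv"
}
```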
