Viridian Workflow

Please see the Viridian Workflow Wiki for full documentation.

Installation

The recommended method is to use a pre-built Docker or Singularity container (see the wiki for how to build your own).

Both the Docker and Singularity containers have the main script viridian_workflow installed.

Docker

Get a Docker image of the latest release:

docker pull ghcr.io/iqbal-lab-org/viridian_workflow:latest

All Docker images are listed in the packages page.
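
As a rough sketch of how the container might be used, the snippet below wraps the paired Illumina command from the Usage section in docker run. The /data mount point, the input file locations, and the IMAGE placeholder are assumptions for illustration, not part of this README; the run helper is hypothetical and only prints the command unless DRY_RUN=0 is set.

```shell
# Sketch only. Assumptions not stated in this README: the /data mount point,
# input files in the current directory, and an image with no entrypoint that
# changes how the arguments are interpreted.
IMAGE="${IMAGE:-IMAGE_NAME:TAG}"   # set IMAGE to the image you pulled

# Hypothetical helper: print the command when DRY_RUN=1 (the default here),
# execute it when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$@"
  else
    "$@"
  fi
}

# Bind-mount the current directory at /data so the container can read the
# inputs and write the output directory back to the host.
run docker run --rm \
  -v "$PWD":/data \
  "$IMAGE" \
  viridian_workflow run_one_sample \
    --tech illumina \
    --ref_fasta /data/MN908947.fasta \
    --reads1 /data/reads_1.fastq.gz \
    --reads2 /data/reads_2.fastq.gz \
    --outdir /data/OUT
```

Check the printed command first, then rerun with DRY_RUN=0 and IMAGE set to the image you pulled.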

Singularity

Each release includes a Singularity image file to download, called viridian_workflow_vX.Y.Z.img, where X.Y.Z is the release version.
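
A minimal sketch of calling the script through a downloaded image is shown below. The viridian_singularity function is a hypothetical wrapper, not something shipped with this repository. It relies only on singularity exec IMAGE COMMAND, and checks that both the singularity binary and the image file are present before running.

```shell
# Sketch only: hypothetical wrapper around `singularity exec`.
# $1 is the path to the downloaded image; remaining arguments are passed
# to viridian_workflow unchanged.
viridian_singularity() {
  img="$1"
  shift
  if ! command -v singularity >/dev/null 2>&1; then
    echo "singularity not found on PATH" >&2
    return 1
  fi
  if [ ! -f "$img" ]; then
    echo "image not found: $img" >&2
    return 1
  fi
  singularity exec "$img" viridian_workflow "$@"
}

# Example call (replace X.Y.Z with the release version you downloaded):
#   viridian_singularity viridian_workflow_vX.Y.Z.img run_one_sample --tech ont ...
```

Depending on how Singularity is configured on your system, you may need to bind the directories holding your reads and reference so they are visible inside the container.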

Usage

To run on paired Illumina reads:

viridian_workflow run_one_sample \
  --tech illumina \
  --ref_fasta data/MN908947.fasta \
  --reads1 reads_1.fastq.gz \
  --reads2 reads_2.fastq.gz \
  --outdir OUT

To run on unpaired nanopore reads:

viridian_workflow run_one_sample \
  --tech ont \
  --ref_fasta data/MN908947.fasta \
  --reads reads.fastq.gz \
  --outdir OUT

The reference FASTA file used in those commands (MN908947.fasta) can be found in the viridian_workflow/amplicon_scheme_data/ directory of this repository; adjust the --ref_fasta path to wherever your copy is.

Other options:

  • --sample_name MY_NAME: use this to change the sample name (default is "sample") that is put in the final FASTA file, BAM file, and VCF file.
  • --keep_bam: use this option to keep the BAM file of original input reads mapped to the reference genome.
  • --force: use with caution - it will overwrite the output directory if it already exists.
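
To illustrate how these options combine, here is a sketch of a batch loop over several paired Illumina samples. The run_batch function, the VIRIDIAN variable, and the NAME_1.fastq.gz / NAME_2.fastq.gz naming convention are assumptions made for this example; only the viridian_workflow flags come from this README.

```shell
# Sketch only. VIRIDIAN is a hypothetical override so the same loop can call
# a container wrapper; it is left unquoted below so it may hold several words.
VIRIDIAN="${VIRIDIAN:-viridian_workflow}"

# Run every NAME_1.fastq.gz / NAME_2.fastq.gz pair found in a directory.
# $1 = directory of reads, $2 = reference FASTA, $3 = root output directory.
run_batch() {
  reads_dir="$1"
  ref_fasta="$2"
  out_root="$3"
  mkdir -p "$out_root"
  for r1 in "$reads_dir"/*_1.fastq.gz; do
    [ -e "$r1" ] || continue            # glob matched nothing
    name=$(basename "$r1" _1.fastq.gz)
    r2="$reads_dir/${name}_2.fastq.gz"
    if [ ! -e "$r2" ]; then
      echo "skipping $name: missing $r2" >&2
      continue
    fi
    # One output directory per sample; the sample name is taken from the
    # file name, and the mapped BAM is kept.
    $VIRIDIAN run_one_sample \
      --tech illumina \
      --ref_fasta "$ref_fasta" \
      --reads1 "$r1" \
      --reads2 "$r2" \
      --sample_name "$name" \
      --keep_bam \
      --outdir "$out_root/$name"
  done
}

# Example call:
#   run_batch reads data/MN908947.fasta results
```

The loop deliberately leaves out --force, so an existing output directory is never overwritten by accident.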

Pipeline

flowchart TD
    A[Map] --> B[Identify Amplicons];
    B --> C[Downsample Reads];
    C --> D[Identify Primers];
    D --> E[Amplicon Assembly];
    E --> F[Consensus QC];
    D --> F;
    B -- reference_mapped.bam --> G;
    F -- consensus.fa --> G;

Output files

The default files in the output directory are:

  • consensus.fa: a FASTA file of the consensus sequence.
  • variants.vcf: a VCF file of the identified variants between the consensus sequence and the reference genome.
  • log.json: contains logging information for the viridian workflow run.

If the option --keep_bam is used, then a sorted BAM file of the reads mapped to the reference will also be present, called reference_mapped.bam (and its index file reference_mapped.bam.bai).

If the option --dump_tsv is used, a per-position table of statistics will be saved as all_stats.tsv.
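
As a quick sanity check after a run, something like the following could confirm that the files described above are present. The check_outputs function is a hypothetical helper written for this example; it only tests for file existence and makes no assumptions about file contents.

```shell
# Sketch only: hypothetical helper that checks an output directory.
# Returns 0 if all default files exist, 1 otherwise.
check_outputs() {
  outdir="$1"
  missing=0
  # Files that are always expected.
  for f in consensus.fa variants.vcf log.json; do
    if [ ! -e "$outdir/$f" ]; then
      echo "missing: $outdir/$f" >&2
      missing=1
    fi
  done
  # Files that only appear with --keep_bam or --dump_tsv.
  for f in reference_mapped.bam reference_mapped.bam.bai all_stats.tsv; do
    if [ -e "$outdir/$f" ]; then
      echo "optional file present: $f"
    fi
  done
  return "$missing"
}

# Example call:
#   check_outputs OUT
```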
