amplicontig

Identify paired reads that match primer sequences and assemble tiled amplicons.

This approach is valid for sequencing protocols that do not fragment the amplicons, so that each read begins with a primer sequence.

Synopsis

amplicontig 0.1.5

USAGE:
    amplicontig [FLAGS] <primers> <R1> <R2> [SUBCOMMAND]

FLAGS:
    -h, --help       Prints help information
    -p, --prefix     set the output file(s) prefix
    -v               report primer match stats
    -V, --version    Prints version information

ARGS:
    <primers>    primer set
    <R1>         R1 reads
    <R2>         R2 reads

SUBCOMMANDS:
    assemble    bin matched and merged read pairs into consensus
    help        Prints this message or the help of the given subcommand(s)
    match       match reads against a primer set
    test        test reads against a set of primers

Primer spec

example:

amplicon,name,left,forward,primer,position
nCoV-2019_1,LEFT,true,true,ACCAACCAACTTTCGATCTCTTGT,30
nCoV-2019_1,RIGHT,false,false,CATCTTTAAGATGTTGACGTGCCTC,230
nCoV-2019_1,RIGHT_alt,false,false,GTCTTTAAGATGTTGACGTGCC,229
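
As an illustration of the format above, the rows could be read into a small record type. This is a minimal sketch using only the standard library; the `Primer` struct, its field types, and the `parse_primer` and `parse_primers` helpers are assumptions made for this example and are not taken from the amplicontig source.

```rust
/// One row of the primer CSV. Field names follow the example header;
/// the field types are assumptions for illustration.
#[derive(Debug, PartialEq)]
struct Primer {
    amplicon: String,
    name: String,
    left: bool,
    forward: bool,
    primer: String,
    position: usize,
}

/// Parse a single data row (not the header).
/// Returns `None` if the row does not have six well-formed fields.
fn parse_primer(line: &str) -> Option<Primer> {
    let fields: Vec<&str> = line.trim().split(',').collect();
    if fields.len() != 6 {
        return None;
    }
    Some(Primer {
        amplicon: fields[0].to_string(),
        name: fields[1].to_string(),
        left: fields[2].parse().ok()?,
        forward: fields[3].parse().ok()?,
        primer: fields[4].to_string(),
        position: fields[5].parse().ok()?,
    })
}

/// Parse a whole primer file, skipping the header line and any malformed rows.
fn parse_primers(text: &str) -> Vec<Primer> {
    text.lines().skip(1).filter_map(parse_primer).collect()
}
```

Silently skipping malformed rows keeps the sketch short; a real loader would more likely report them.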

Installation

cargo build --release

Examples

target/release/amplicontig artic-v3.csv ERR4659819_1.fastq.gz ERR4659819_2.fastq.gz test

Pipeline Description

Primer identification

The 5' ends of individual reads are used to identify primer sequences.

  • Exact string matching in O(n)
  • Otherwise, the primer with the lowest Hamming distance, up to a threshold (default: 3)

This stage reports the most prevalent inexact matches and preserves unidentified reads.
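
The two matching rules can be sketched as follows. This is a simplified linear scan over the primer set, not the tool's own implementation: the function names `hamming` and `match_primer` are hypothetical, and the sketch does not reproduce whatever indexing amplicontig uses for exact matching.

```rust
/// Hamming distance between two byte strings of equal length.
fn hamming(a: &[u8], b: &[u8]) -> usize {
    a.iter().zip(b).filter(|(x, y)| x != y).count()
}

/// Return the index of the primer that best matches the 5' end of `read`,
/// or `None` if no primer is within `threshold` mismatches.
/// An exact match wins immediately; ties keep the earliest primer.
fn match_primer(read: &[u8], primers: &[&[u8]], threshold: usize) -> Option<usize> {
    let mut best: Option<(usize, usize)> = None; // (distance, primer index)
    for (i, p) in primers.iter().enumerate() {
        if read.len() < p.len() {
            continue; // read too short to contain this primer
        }
        let d = hamming(&read[..p.len()], p);
        if d == 0 {
            return Some(i);
        }
        if d <= threshold && best.map_or(true, |(bd, _)| d < bd) {
            best = Some((d, i));
        }
    }
    best.map(|(_, i)| i)
}
```

A read for which this returns `None` corresponds to an unidentified read, which the stage preserves rather than discards.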

Mate polishing

Read pairs from fragments shorter than twice the read length (e.g. 500 bp) overlap at their 3' ends. Each such pair can be merged into a single read, with positions where the two mates disagree written as N.

ACGTGTGTC->
   <-TCTCACGTCG
      |
ACGTGTNTCACGTCG
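
The merge in the diagram can be sketched as below. The sketch assumes the second mate has already been reverse-complemented into the orientation of the first, and it picks the longest overlap with at most `max_mismatch` disagreements. The function names and both parameters are assumptions for illustration, not amplicontig's actual interface.

```rust
/// Reverse-complement a DNA sequence; anything other than A, C, G, T becomes N.
fn reverse_complement(seq: &[u8]) -> Vec<u8> {
    seq.iter()
        .rev()
        .map(|&b| match b {
            b'A' => b'T',
            b'C' => b'G',
            b'G' => b'C',
            b'T' => b'A',
            _ => b'N',
        })
        .collect()
}

/// Merge `r1` with `r2` (already in the same orientation as `r1`) by
/// overlapping the 3' end of `r1` with the start of `r2`.
/// Tries the longest overlap first; disagreeing positions become N.
fn merge_pair(r1: &[u8], r2: &[u8], min_overlap: usize, max_mismatch: usize) -> Option<Vec<u8>> {
    let longest = r1.len().min(r2.len());
    for k in (min_overlap..=longest).rev() {
        let tail = &r1[r1.len() - k..];
        let head = &r2[..k];
        let mismatches = tail.iter().zip(head).filter(|(a, b)| a != b).count();
        if mismatches <= max_mismatch {
            // Non-overlapping prefix of r1, then the overlap consensus,
            // then the non-overlapping suffix of r2.
            let mut merged = r1[..r1.len() - k].to_vec();
            merged.extend(
                tail.iter()
                    .zip(head)
                    .map(|(&a, &b)| if a == b { a } else { b'N' }),
            );
            merged.extend_from_slice(&r2[k..]);
            return Some(merged);
        }
    }
    None
}
```

With the reads from the diagram, `merge_pair(b"ACGTGTGTC", b"TCTCACGTCG", 4, 1)` finds the four-base overlap and yields `ACGTGTNTCACGTCG`.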

Amplicon binning

Amplicons are binned and counted, and Ns are removed where another read in the same bin supplies the missing base.

    ACGTGTNTCACGTCG
    ACGTGTGTCNCGTCG
    CCCTGGCTCACANCGC


result:
    ACGTGTGTCACGTCG, 2
    CCCTGGCTCACANCGC, 1

By default, a bin is discarded if:

  • Its fragment size is less than 50% of the shortest amplicon length
  • It has fewer than 100 members
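
One way to reproduce the binning example and the default filters is a greedy pass in which a read joins the first bin whose consensus it agrees with at every non-N position. This is a sketch under that assumption: the `Bin` type and the `compatible`, `bin_reads`, and `filter_bins` functions are hypothetical, and a greedy pass can depend on input order in ways the real tool may not.

```rust
/// A bin: a consensus sequence and the number of reads assigned to it.
#[derive(Debug, Clone, PartialEq)]
struct Bin {
    consensus: Vec<u8>,
    count: usize,
}

/// Two sequences are compatible if they have equal length and agree
/// at every position where neither has an N.
fn compatible(a: &[u8], b: &[u8]) -> bool {
    a.len() == b.len()
        && a.iter()
            .zip(b)
            .all(|(&x, &y)| x == y || x == b'N' || y == b'N')
}

/// Greedily assign each read to the first compatible bin,
/// filling Ns in the consensus from the new read.
fn bin_reads(reads: &[&[u8]]) -> Vec<Bin> {
    let mut bins: Vec<Bin> = Vec::new();
    for read in reads {
        if let Some(i) = bins.iter().position(|b| compatible(&b.consensus, read)) {
            let bin = &mut bins[i];
            for (c, &r) in bin.consensus.iter_mut().zip(read.iter()) {
                if *c == b'N' {
                    *c = r;
                }
            }
            bin.count += 1;
        } else {
            bins.push(Bin { consensus: read.to_vec(), count: 1 });
        }
    }
    bins
}

/// Drop bins shorter than 50% of the shortest amplicon length
/// or with fewer than `min_members` reads.
fn filter_bins(bins: Vec<Bin>, shortest_amplicon: usize, min_members: usize) -> Vec<Bin> {
    bins.into_iter()
        .filter(|b| 2 * b.consensus.len() >= shortest_amplicon && b.count >= min_members)
        .collect()
}
```

Applied to the three reads in the example, `bin_reads` gives `ACGTGTGTCACGTCG` with count 2 and `CCCTGGCTCACANCGC` with count 1. The documented defaults correspond to calling `filter_bins` with `min_members` set to 100.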

Amplicon merging

Disagreements between forward and reverse fragments are resolved.

Amplicons that overlap according to primer position are assembled into contigs.
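
A position-based assembly of this kind might look like the sketch below. It assumes each amplicon consensus comes with a 0-based start coordinate derived from its primer position, joins amplicons whose coordinate ranges overlap, and marks disagreeing overlap positions as N. The `Placed` type, the `assemble` function, and the use of N for disagreements are all assumptions for illustration; the source does not say how amplicontig resolves disagreements.

```rust
/// An amplicon consensus placed at a 0-based reference start (hypothetical type).
struct Placed {
    start: usize,
    seq: Vec<u8>,
}

/// Tile amplicons by start position. An amplicon that starts inside the
/// current contig extends it; otherwise it opens a new contig.
fn assemble(mut amplicons: Vec<Placed>) -> Vec<Vec<u8>> {
    amplicons.sort_by_key(|a| a.start);
    let mut contigs: Vec<Vec<u8>> = Vec::new();
    let mut contig_start = 0usize; // reference start of the last contig
    for a in amplicons {
        let end = contigs.last().map(|c| contig_start + c.len());
        match end {
            Some(end) if a.start < end => {
                let contig = contigs.last_mut().unwrap();
                let offset = a.start - contig_start;
                for (i, &b) in a.seq.iter().enumerate() {
                    let pos = offset + i;
                    if pos < contig.len() {
                        // Overlapping position: keep agreement, mask disagreement.
                        if contig[pos] != b {
                            contig[pos] = b'N';
                        }
                    } else {
                        contig.push(b);
                    }
                }
            }
            _ => {
                contig_start = a.start;
                contigs.push(a.seq);
            }
        }
    }
    contigs
}
```

Amplicons that merely abut, with no shared coordinates, start a new contig here, which follows the wording "overlap according to primer position".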

Output

A FASTA file of contigs.

GFA support is planned.
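
For reference, writing contigs as FASTA records can be sketched with the standard library alone. The record naming scheme (`prefix_1`, `prefix_2`, ...) and the line width are assumptions for this example; the names amplicontig actually writes are not documented here.

```rust
use std::io::{self, Write};

/// Write each contig as one FASTA record, wrapping sequence lines at `width`.
/// Record names are `<prefix>_<n>`, numbered from 1 (an illustrative choice).
fn write_fasta<W: Write>(
    out: &mut W,
    prefix: &str,
    contigs: &[Vec<u8>],
    width: usize,
) -> io::Result<()> {
    let width = width.max(1); // `chunks` panics on a width of zero
    for (i, contig) in contigs.iter().enumerate() {
        writeln!(out, ">{}_{}", prefix, i + 1)?;
        for line in contig.chunks(width) {
            out.write_all(line)?;
            out.write_all(b"\n")?;
        }
    }
    Ok(())
}
```

Because the writer is generic over `Write`, the same function can target a file, standard output, or an in-memory buffer.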

Contributors

jeff-k