
blast-validate

BLAST-based validation of metagenomic sequence assignments


Publication

Published at PeerJ https://peerj.com/articles/4892/

BLAST-based validation of metagenomic sequence assignments. Bazinet AL, Ondov BD, Sommer DD, Ratnayake S. PeerJ. 2018 May 28;6:e4892. doi: 10.7717/peerj.4892. eCollection 2018.

Preprint available at https://doi.org/10.1101/181636

Bazinet, A. L., Ondov, B. D., Sommer, D. D., & Ratnayake, S. (2017). BLAST-based validation of metagenomic sequence assignments. bioRxiv:181636.


Installation

  1. git clone https://github.com/bioforensics/blast-validate.git

  2. Install KronaTools

Install from source:

git clone https://github.com/marbl/Krona
cd Krona/KronaTools
./install.pl --prefix <directory that is in your PATH>
./updateTaxonomy.sh

Install using Bioconda:

conda install krona

  3. Install NCBI BLAST+

  4. Optional: download and install the ART read simulator (needed for parameter optimization)


Parameter optimization

(example using B. anthracis)

1.) Simulate reads with ART.

Example:

 art_illumina -ss HS25 -i B_anthracis_Ames.fa -p -l 250 -f 10 -m 868 -s 408 -o ba_ART_sim -1 /usr/local/packages/art_bin_MountRainier/Illumina_profiles/HiSeq2500L250R1.txt -2 /usr/local/packages/art_bin_MountRainier/Illumina_profiles/HiSeq2500L250R2.txt

 -ss = sequencing system (HS25 = HiSeq 2500)  
 -i = input reference genome in FASTA format  
 -p = paired-end  
 -l = length of reads to simulate (250 bp)  
 -f = fold-coverage (10x)  
 -m = mean fragment length  
 -s = fragment length standard deviation  
 -o = output prefix  
 -1 / -2 = custom quality profiles for R1 and R2
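
To get a feel for how many reads the settings above produce, the sketch below does the coverage arithmetic: total simulated bases (genome length times fold-coverage) divided by the bases in one read pair. This is only an approximation of what a simulator would emit, and the 5,000,000 bp genome length is an illustrative assumption, not a value taken from this README.

```python
def expected_read_pairs(genome_length_bp: int, fold_coverage: float, read_length_bp: int) -> int:
    """Approximate number of read pairs needed to reach the requested fold-coverage."""
    total_bases = genome_length_bp * fold_coverage  # bases to simulate overall
    bases_per_pair = 2 * read_length_bp             # R1 + R2
    return round(total_bases / bases_per_pair)


if __name__ == "__main__":
    # 5,000,000 bp is a hypothetical genome size used only for illustration.
    print(expected_read_pairs(5_000_000, 10, 250))
```

At 10x coverage with 250 bp paired-end reads, a genome of that size corresponds to roughly 100,000 read pairs, which gives a sense of the size of the BLAST search in step 3.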

2.) Convert FASTQ files to FASTA format and combine R1 and R2 into a single file.

Example:

 perl scripts/convertFastqToFasta.pl < ba_ART_sim1.fq > ba_ART_sim1.fa
 perl scripts/convertFastqToFasta.pl < ba_ART_sim2.fq > ba_ART_sim2.fa
 cat ba_ART_sim1.fa ba_ART_sim2.fa > ba_ART_sim.fasta
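
For readers who want to see what the conversion amounts to, here is a minimal Python sketch. It is not the repository's Perl script; the function name `fastq_to_fasta` is hypothetical, and the code assumes plain four-line FASTQ records (header, sequence, `+`, qualities) with no line-wrapped sequences.

```python
from typing import Iterable, Iterator


def fastq_to_fasta(lines: Iterable[str]) -> Iterator[str]:
    """Yield FASTA lines (header, sequence) from four-line FASTQ records."""
    record = []
    for line in lines:
        record.append(line.rstrip("\n"))
        if len(record) == 4:
            header, sequence, _plus, _quality = record
            if not header.startswith("@"):
                raise ValueError(f"unexpected FASTQ header: {header!r}")
            yield ">" + header[1:]  # swap the leading '@' for '>'
            yield sequence          # quality line is dropped
            record = []


if __name__ == "__main__":
    import sys

    # Reads FASTQ on stdin and writes FASTA on stdout.
    for out_line in fastq_to_fasta(sys.stdin):
        print(out_line)
```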

3.) BLAST all reads against the NCBI nt database.

Example:

 blastn -query ba_ART_sim.fasta -task blastn -db /your/path/to/NCBI/nt -outfmt 7 > ba_ART_sim.blast
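
The `-outfmt 7` option produces tab-separated hits with `#` comment lines; with the default columns, the E-value and bit score are the 11th and 12th fields. The sketch below groups hits by query so the output can be inspected before running the validation script. The function name `parse_blast_outfmt7` is hypothetical, and the code assumes the default column layout was not customized.

```python
from collections import defaultdict


def parse_blast_outfmt7(lines):
    """Group hits by query ID; each hit is (subject_id, evalue, bit_score)."""
    hits = defaultdict(list)
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query_id, subject_id = fields[0], fields[1]
        evalue, bit_score = float(fields[10]), float(fields[11])
        hits[query_id].append((subject_id, evalue, bit_score))
    return dict(hits)


if __name__ == "__main__":
    with open("ba_ART_sim.blast") as handle:
        by_query = parse_blast_outfmt7(handle)
    print(f"{len(by_query)} reads had at least one hit")
```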

4.) Run parameter sweep.

Example:

 perl scripts/blastValidate.pl -e 0,-1,-2,-4,-8,-16,-32,-64,-128 -b 0,1,2,4,8,16,32,64,128 -x -t 1392 ba_ART_sim.fasta ba_ART_sim.blast
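
To gauge the size of this sweep, the sketch below enumerates the parameter combinations implied by the command. It assumes the script evaluates every pairing of the listed `-e` and `-b` values (a full grid), which this README does not state explicitly. The value 1392 passed to `-t` matches the NCBI Taxonomy ID of Bacillus anthracis, presumably the target taxon.

```python
from itertools import product

# Values copied from the example command above.
e_values = [0, -1, -2, -4, -8, -16, -32, -64, -128]
b_values = [0, 1, 2, 4, 8, 16, 32, 64, 128]


def sweep_grid(e_vals, b_vals):
    """Return every (e, b) pairing, assuming a full grid search."""
    return list(product(e_vals, b_vals))


if __name__ == "__main__":
    grid = sweep_grid(e_values, b_values)
    print(f"{len(grid)} parameter combinations")
```

Under that assumption, nine values for each parameter give 81 combinations to evaluate in step 6.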

5.) (Optional) Repeat steps 1-4 for a near-neighbor genome (e.g., a B. cereus genome).

6.) Evaluate results of parameter sweep.

Example:

 perl scripts/evaluate_parameter_sweep.pl 1392 /path/to/B_anthracis/LCA/files [/path/to/B_cereus/LCA/files]

The bracketed second path is optional and points to the near-neighbor (B. cereus) results from step 5.
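
One plausible way to compare parameter settings is to look at how many simulated target reads are assigned to the target taxon (sensitivity) and how many near-neighbor reads are wrongly assigned to it (false positive rate). The sketch below illustrates that arithmetic only. It is not the logic of `evaluate_parameter_sweep.pl`, and the function name and the counts in the example are hypothetical.

```python
def summarize_setting(target_assigned: int, target_total: int,
                      neighbor_assigned: int, neighbor_total: int) -> dict:
    """Summarize one parameter setting from read counts (illustrative only)."""
    if target_total <= 0 or neighbor_total <= 0:
        raise ValueError("read totals must be positive")
    return {
        # fraction of target-genome reads assigned to the target taxon
        "sensitivity": target_assigned / target_total,
        # fraction of near-neighbor reads wrongly assigned to the target taxon
        "false_positive_rate": neighbor_assigned / neighbor_total,
    }


if __name__ == "__main__":
    # Hypothetical counts, not results from this project.
    print(summarize_setting(90, 100, 5, 100))
```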

Validate reads provisionally assigned to a target taxon

(example using NYC subway data and B. anthracis)

1.) BLAST reads against the NCBI nt database.

Example:

 blastn -query subway_analysis/SRR1748708_Kraken_Ba.fasta -task blastn -db /your/path/to/NCBI/nt -outfmt 7 > subway_analysis/SRR1748708_Kraken_Ba.blast

2.) Run the BLAST-validate script using parameter values previously determined to be optimal.

Example:

 perl scripts/blastValidate.pl -e -64 -b 8 -x -t 1392 subway_analysis/SRR1748708_Kraken_Ba.fasta subway_analysis/SRR1748708_Kraken_Ba.blast
