This project forked from phelimb/bigsi

BItsliced Genomic Signature Index - Efficient indexing and search in very large collections of WGS data

Home Page: http://www.bigsi.io

License: MIT License

BItsliced Genomic Signature Index [BIGSI]

BIGSI can search a collection of raw reads (fastq/bam), contigs or assemblies for genes, variant alleles and arbitrary sequences. It can scale to millions of bacterial genomes, requiring ~3 MB of disk per sample while maintaining millisecond k-mer queries across the collection.

This tool was formerly named "Coloured Bloom Graph" or "CBG" in reference to the fact that it can be viewed as a coloured probabilistic de Bruijn graph.

Documentation can be found at https://bigsi.readme.io/. An index of the microbial ENA/SRA (Dec 2016) can be queried at http://www.bigsi.io.

You can read more in our preprint here: https://www.biorxiv.org/content/early/2017/12/15/234955.

Install

BIGSI has a Docker image that bundles mccortex, Berkeley DB and BIGSI. See https://bigsi.readme.io/docs for install instructions.

Quickstart

Prepare the data

Requires mccortex.

mccortex/bin/mccortex31 build -k 31 -s test1 -1 example-data/kmers.txt example-data/test1.ctx
mccortex/bin/mccortex31 build -k 31 -s test2 -1 example-data/kmers.txt example-data/test2.ctx

Construct the bloom filters

bigsi init test-bigsi --k 31 --m 1000 --h 1

bigsi bloom --db test-bigsi -c example-data/test1.ctx example-data/test1.bloom
bigsi bloom --db test-bigsi -c example-data/test2.ctx example-data/test2.bloom
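
To make the --k, --m and --h options more concrete, here is a minimal Python sketch of a Bloom filter built over k-mers. It is an illustration only, not BIGSI's implementation: it assumes --m is the number of bits per filter and --h the number of hash functions, and the seeded SHA-256 hashing, the function names and the toy sequence are all hypothetical choices. Reverse complements are ignored for simplicity.

```python
import hashlib


def kmers(sequence, k):
    """Yield every overlapping substring of length k."""
    for i in range(len(sequence) - k + 1):
        yield sequence[i:i + k]


def bit_positions(kmer, m, h):
    """Map a k-mer to h positions in [0, m) using seeded SHA-256 (illustrative choice)."""
    for seed in range(h):
        digest = hashlib.sha256(f"{seed}:{kmer}".encode()).digest()
        yield int.from_bytes(digest[:8], "big") % m


def build_bloom(sequence, k=31, m=1000, h=1):
    """Set the bits of every k-mer in the sequence and return the filter as a list of bools."""
    bits = [False] * m
    for kmer in kmers(sequence, k):
        for pos in bit_positions(kmer, m, h):
            bits[pos] = True
    return bits


def might_contain(bits, kmer, h=1):
    """True if all of the k-mer's bits are set; false positives are possible."""
    return all(bits[pos] for pos in bit_positions(kmer, len(bits), h))


# Hypothetical 36-base toy sequence, giving 36 - 31 + 1 = 6 k-mers.
toy = "CGGCGAGGAAGCGTTAAATCTCTTTCTGACGATTGC"
bloom = build_bloom(toy, k=31, m=1000, h=1)
print(might_contain(bloom, toy[:31]))
```

A small m such as 1000 is only sensible for toy data like this; with many k-mers per sample, a filter that small would fill up and report almost every k-mer as present.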

Build the combined graph

bigsi build test-bigsi example-data/test1.bloom example-data/test2.bloom -s s1 -s s2
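
The "bitsliced" idea behind the combined index can be sketched as follows, assuming each sample's Bloom filter becomes one column of a bit matrix and each row is stored as a bitmask. This is a simplified model with hypothetical function names and toy data, not BIGSI's storage format.

```python
def build_index(blooms):
    """Transpose per-sample Bloom filters into one integer bitmask per bit position."""
    m = len(blooms[0])
    rows = [0] * m
    for column, bloom in enumerate(blooms):
        for position, bit in enumerate(bloom):
            if bit:
                rows[position] |= 1 << column
    return rows


def query_rows(rows, positions, sample_names):
    """AND the rows for the given bit positions; surviving bits mark candidate samples."""
    mask = (1 << len(sample_names)) - 1
    for position in positions:
        mask &= rows[position]
    return [name for column, name in enumerate(sample_names) if (mask >> column) & 1]


# Two toy 8-bit Bloom filters standing in for test1.bloom and test2.bloom.
s1 = [1, 0, 1, 0, 0, 1, 0, 0]
s2 = [1, 1, 0, 0, 0, 1, 0, 0]
rows = build_index([s1, s2])

print(query_rows(rows, [0, 5], ["s1", "s2"]))  # ['s1', 's2']: both have bits 0 and 5 set
print(query_rows(rows, [0, 2], ["s1", "s2"]))  # ['s1']: only s1 has bit 2 set
```

Under this layout a k-mer lookup touches only as many rows as there are hash functions, regardless of how many samples are indexed.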

Query the graph

bigsi search -o tsv --db test-bigsi -s CGGCGAGGAAGCGTTAAATCTCTTTCTGACG
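
The query above is 31 bases long, so with k = 31 it is exactly one k-mer. The short sketch below shows how a longer query would break into overlapping k-mers, assuming the search looks up each k-mer of the query separately. The helper name and the extended query are hypothetical.

```python
def kmers(sequence, k=31):
    """Return all overlapping substrings of length k."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


query = "CGGCGAGGAAGCGTTAAATCTCTTTCTGACG"
print(len(query))                   # 31
print(len(kmers(query)))            # 1: the query is a single 31-mer
print(len(kmers(query + "ACGT")))   # 5: a 35-base query yields 35 - 31 + 1 k-mers
```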

Insert a new sample into the graph

bigsi insert test-bigsi example-data/test3.bloom s3
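
Conceptually, inserting a sample into a bitsliced index amounts to appending one column. A minimal sketch, assuming each row is stored as an integer bitmask with one bit per sample (the names and toy data are hypothetical, and this is not BIGSI's on-disk format):

```python
def insert_sample(rows, bloom, num_samples):
    """Add a sample as column `num_samples`: set that bit in every row where its Bloom filter is set."""
    for position, bit in enumerate(bloom):
        if bit:
            rows[position] |= 1 << num_samples
    return num_samples + 1


rows = [0b11, 0b10, 0b01, 0b00]  # toy index already holding two samples (columns 0 and 1)
s3 = [1, 0, 0, 1]                # hypothetical Bloom filter for the new sample
count = insert_sample(rows, s3, num_samples=2)

print(count)                     # 3
print([bin(r) for r in rows])    # ['0b111', '0b10', '0b1', '0b100']
```

Existing columns are left untouched, which is why a new sample can be added without rebuilding the filters of the samples already indexed.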

Quickstart with docker

docker pull phelimb/bigsi
docker run phelimb/bigsi bigsi --help

Preparing your data

BIGSI uses single-colour graphs to construct the coloured graph. Use mccortex to build them.

PWD=`pwd`
docker run -v $PWD/example-data:/data phelimb/bigsi mccortex/bin/mccortex31 build -k 31 -s test1 -1 /data/kmers.txt /data/test1.ctx
docker run -v $PWD/example-data:/data phelimb/bigsi mccortex/bin/mccortex31 build -k 31 -s test2 -1 /data/kmers.txt /data/test2.ctx

Building a BIGSI

Construct the bloom filters

docker run -v $PWD/example-data:/data phelimb/bigsi bigsi init /data/test.bigsi --k 31 --m 1000 --h 1

docker run -v $PWD/example-data:/data phelimb/bigsi bigsi bloom --db /data/test.bigsi -c /data/test1.ctx /data/test1.bloom
docker run -v $PWD/example-data:/data phelimb/bigsi bigsi bloom --db /data/test.bigsi -c /data/test2.ctx /data/test2.bloom

Build the combined graph

docker run -v $PWD/example-data:/data phelimb/bigsi bigsi build /data/test.bigsi /data/test1.bloom /data/test2.bloom

Query the graph

docker run -v $PWD/example-data:/data phelimb/bigsi bigsi search --db /data/test.bigsi -s CGGCGAGGAAGCGTTAAATCTCTTTCTGACG

Citation

Please cite

Phelim Bradley, Henk den Bakker, Eduardo Rocha, Gil McVean, Zamin Iqbal
bioRxiv 234955; doi: https://doi.org/10.1101/234955 

if you use BIGSI in your work.
