cmu-safari / grim

Source code of the processing-in-memory simulator used in the GRIM-Filter paper published at BMC Genomics in 2018: "GRIM-Filter: Fast Seed Location Filtering in DNA Read Mapping using Processing-in-Memory Technologies" (preliminary version at https://arxiv.org/pdf/1711.01177.pdf)

Home Page: https://arxiv.org/pdf/1711.01177.pdf

C 99.84% Makefile 0.16%

GRIM-Filter

GRIM-Filter is an algorithm optimized to exploit 3D-stacked memory systems that integrate computation within a logic layer stacked under memory layers, to perform processing-in-memory (PIM). GRIM-Filter quickly filters seed locations by 1) introducing a new representation of coarse-grained segments of the reference genome, and 2) using massively-parallel in-memory operations to identify read presence within each coarse-grained segment.
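
The two ideas above can be illustrated with a minimal Python sketch. It is not the code in this repository: the function names, the `overlap` parameter, and the use of a Python integer as a bitvector are assumptions made for illustration, and it only handles the bases A, C, G, and T. Each bin of the reference gets one bitvector that marks which tokens (q-grams) occur in it, and a read is kept for a bin only if enough of its tokens are marked there.

```python
BASE = {"A": 0, "C": 1, "G": 2, "T": 3}

def token_id(token):
    """Encode a q-gram over A/C/G/T as an integer in [0, 4**q)."""
    value = 0
    for ch in token:
        value = value * 4 + BASE[ch]
    return value

def bitvector(segment, q):
    """Return an int whose bit i is set if the token with id i occurs in segment."""
    bits = 0
    for i in range(len(segment) - q + 1):
        bits |= 1 << token_id(segment[i:i + q])
    return bits

def build_bins(reference, bin_size, q, overlap):
    """Build one bitvector per bin.

    Each bin is extended by `overlap` bases (hypothetical parameter) so that
    a read spanning a bin boundary is still fully covered by one bin.
    """
    return [bitvector(reference[start:start + bin_size + overlap], q)
            for start in range(0, len(reference), bin_size)]

def candidate_bins(read, bins, q, threshold):
    """Return indices of bins where at least `threshold` read tokens are present."""
    ids = [token_id(read[i:i + q]) for i in range(len(read) - q + 1)]
    return [index for index, bits in enumerate(bins)
            if sum((bits >> t) & 1 for t in ids) >= threshold]

# Toy reference with three 10-base bins; reads are 6 bases long.
reference = "ACGTACGTGG" + "TTTTTTTTTT" + "CCCCCCCCCC"
bins = build_bins(reference, bin_size=10, q=3, overlap=5)
print(candidate_bins("ACGTAC", bins, q=3, threshold=4))
```

In this sketch the per-bin check is a loop over token ids. In a PIM setting the same check could be applied to many bins in parallel, since each bin's bitvector is independent of the others.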

Our code baseline is taken from mrFAST_v2.6.1.0, which is described in detail in its own publications.

While we use mrFAST as a baseline, GRIM-Filter can be adapted to run with any other read mapper.

The GRIM-Filter algorithm is described in: J.S. Kim et al., "GRIM-Filter: Fast Seed Location Filtering in DNA Read Mapping using Processing-in-Memory Technologies", BMC Genomics, 2018.

Prerequisites

To run GRIM-Filter, you need the following files:

  • Human Genome FASTA file (e.g., Human_g1k_v37 Genome)
  • Read Sequence data sets (FASTA file)

Getting Started

To build mrFAST with GRIM-Filter, run:

make

To build the hash table used by mrFAST, run the following command:

./mrfast --index <Genome FASTA File>

See the mrFAST User Manual for more information on the hash table generation parameters.

To build the bitvectors that are referenced by GRIM-Filter, run the following command:

./mrfast --index <Genome FASTA File> -t 0 -k <Number of Bins> -b <Token Size> -f <Number of Tokens the Bitvector can Count (1)>

This will generate a .bv file in the same directory as your Genome FASTA File.
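
To get a feel for how the number of bins and the token size affect storage, the following Python sketch estimates the raw bitvector size. It assumes one bit per possible token per bin (suggested by the value 1 for `-f` above) and ignores any header or padding. The actual layout of the .bv file is not documented here, so treat the result as a rough lower bound. The function name `bitvector_bytes` is hypothetical.

```python
def bitvector_bytes(num_bins, token_size):
    """Estimate bytes needed for num_bins bitvectors, one bit per possible token.

    Assumption: tokens are strings over A, C, G, T, so each bin needs
    4 ** token_size bits.
    """
    bits_per_bin = 4 ** token_size
    total_bits = num_bins * bits_per_bin
    return (total_bits + 7) // 8  # round up to whole bytes

# Example: 1000 bins with 5-base tokens need 1024 bits (128 bytes) per bin.
print(bitvector_bytes(1000, 5))
```

Under this assumption, each extra base of token size multiplies the storage by four, while the number of bins scales it linearly.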

You can then use the bitvectors by running mrfast with the following command:

./mrfast --search <Genome FASTA File> -b <Token Size> -t 1 -e <Error Tolerance (%)> -k <Number of Bins> -q 1 --seq <Read Sequences FASTA File>
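
The token size and error tolerance together bound how many tokens a read must share with a bin to remain a candidate. The Python sketch below applies the general q-gram lemma: a read of length L has L - q + 1 tokens, and each error can destroy at most q of them. Whether mrfast computes its threshold exactly this way is not stated here. The function name and the reading of `-e` as a percentage of the read length are assumptions.

```python
def min_shared_tokens(read_len, token_size, error_pct):
    """Lower bound on tokens a read shares with its true location (q-gram lemma).

    error_pct is an integer percentage of the read length (assumed meaning).
    """
    max_errors = (read_len * error_pct) // 100
    total_tokens = read_len - token_size + 1
    # Each error can affect at most token_size overlapping tokens.
    return max(0, total_tokens - token_size * max_errors)

# Example: 100-base reads, 5-base tokens, 5% error tolerance.
print(min_shared_tokens(100, 5, 5))
```

A bound of zero means that the chosen token size is too large for the error tolerance to filter anything, because every token of the read could be hit by an error.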

Contributors

  • Jeremie S. Kim (Carnegie Mellon University)
