shao-group / rnabridge-align

rnabridge-align is an efficient tool to bridge paired-end RNA-seq reads

License: BSD 3-Clause "New" or "Revised" License


rnabridge-align's Introduction

Overview

rnabridge-align implements an efficient algorithm to bridge paired-end RNA-seq reads, i.e., to determine the alignment of full fragments given the alignments of the two mate ends. Its sister tool, rnabridge-denovo, determines the sequences of full fragments given the sequences of paired-end reads. See rnabridge-test for the evaluation of both tools.

Release

The latest release of rnabridge-align is v1.0.1.

Installation

Download the source code of rnabridge-align from here. rnabridge-align depends on two additional libraries, Boost and htslib. If they are not installed on your system, download and install them first. You might also need to export the runtime library path to the appropriate environment variable (for example, LD_LIBRARY_PATH on most Linux distributions). After installing these dependencies, compile the source code of rnabridge-align. If any of the dependencies is not installed in a default system directory (for example, /usr/local on most Linux distributions), pass its installation path to the configure script of rnabridge-align.

Download Boost

If Boost has not been downloaded or installed, download Boost (license) from http://www.boost.org and uncompress it anywhere (compiling and installing are not necessary).

Install htslib

If htslib has not been installed, download htslib (license), version 1.5 or higher, from http://www.htslib.org/. Note that htslib relies on zlib, so if zlib is not installed on your system, install it first.

Use the following commands to build htslib:

./configure --disable-bz2 --disable-lzma --disable-gcs --disable-s3 --enable-libcurl=no
make
make install

By default, htslib is installed under /usr/local. To install it to a different location, append --prefix=/path/to/your/htslib to the configure line above:

./configure --disable-bz2 --disable-lzma --disable-gcs --disable-s3 --enable-libcurl=no --prefix=/path/to/your/htslib

In this case, you also need to export the runtime library path (note that there is an additional lib following the installation path):

export LD_LIBRARY_PATH=/path/to/your/htslib/lib:$LD_LIBRARY_PATH

Compile rnabridge-align

Use the following to compile rnabridge-align:

./configure --with-htslib=/path/to/your/htslib --with-boost=/path/to/your/boost
make

If some of the dependencies are installed in the default system directory (for example, /usr/lib), then the corresponding --with- option might not be necessary. The executable file rnabridge-align will appear at src/rnabridge-align.

Usage

The usage of rnabridge-align is:

./rnabridge-align -i <input.bam> -o <output.bam> [-r reference.gtf] [options]

The input.bam is the read alignment file generated by an RNA-seq aligner (for example, TopHat2, STAR, or HISAT2). Make sure that it is sorted; otherwise, sort it with samtools:

samtools sort input.bam > input.sort.bam
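
One way to tell whether a BAM file is already sorted is to look at the SO field of the @HD header line (for example, in the output of samtools view -H input.bam). The following is a minimal sketch of that check on header text. It assumes that coordinate sorting, which is what samtools sort produces by default, is the order expected here; the text above does not say so explicitly. The function name header_sort_order is hypothetical and not part of rnabridge-align.

```python
def header_sort_order(header_text):
    """Return the SO value from the @HD line of a SAM header, or None if absent."""
    for line in header_text.splitlines():
        if line.startswith("@HD"):
            # After the record type, header fields are tab-separated TAG:VALUE pairs.
            for field in line.split("\t")[1:]:
                tag, _, value = field.partition(":")
                if tag == "SO":
                    return value
            return None  # @HD line present, but no SO field
    return None  # no @HD line at all


# Example header text, as it might be printed by `samtools view -H`.
header = "@HD\tVN:1.6\tSO:coordinate\n@SQ\tSN:chr1\tLN:1000\n"
print(header_sort_order(header) == "coordinate")
```

A result other than "coordinate" (including None) suggests running the samtools sort command above before bridging.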

The alignments of entire fragments will be written to output.bam.

rnabridge-align can also make use of a reference transcriptome to improve bridging accuracy. The reference transcriptome can be provided with -r reference.gtf.
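
When rnabridge-align is called from a pipeline, the command line can be assembled programmatically. The sketch below uses only the -i, -o, and -r options shown in the usage line above. The function name build_command and the default executable path are assumptions for illustration.

```python
import shlex


def build_command(input_bam, output_bam, reference_gtf=None,
                  executable="./rnabridge-align"):
    """Assemble the rnabridge-align argument list from the documented options."""
    cmd = [executable, "-i", input_bam, "-o", output_bam]
    if reference_gtf is not None:
        # The reference transcriptome is optional.
        cmd += ["-r", reference_gtf]
    return cmd


cmd = build_command("input.sort.bam", "output.bam", reference_gtf="reference.gtf")
# Print the command as a single shell-quoted string for inspection.
print(shlex.join(cmd))
```

The returned list can be passed to subprocess.run(cmd, check=True) once the executable has been built.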

rnabridge-align supports the following parameters.

Parameters                 Default Value  Description
--help                                    print usage of rnabridge-align and exit
--version                                 print version of rnabridge-align and exit
--preview                                 show the inferred library_type and fragment-length range, then exit
--library_type             empty          one of {empty, unstranded, first, second} (see below)
--min_bridging_score       0.5            minimum bottleneck weight of a bridging path
--dp_solution_size         10             number of candidate bridging paths
--dp_stack_size            5              number of weights maintained for each bridging path
--max_clustring_flank      30             maximum base-pair difference for being in the same equivalence class
--flank_tiny_length        10             maximum length for reconsidering error correction
--flank_tiny_ratio         0.4            maximum ratio for reconsidering error correction
--min_splice_bundary_hits  1              minimum number of spliced reads required to support a junction
--max_num_cigar            1000           ignore reads with CIGAR size larger than this value

Providing --library_type is highly recommended. The values unstranded, first, and second correspond to fr-unstranded, fr-firststrand, and fr-secondstrand used in standard Illumina sequencing libraries. If none of them is given (the default, empty), rnabridge-align will try to infer the library_type by itself. Note that this inference is based on the XS tag stored in the input BAM file. If the input BAM file does not contain the XS tag, it is essential to provide the library_type to rnabridge-align. You can use --preview to see the inferred library_type.
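
Before relying on the automatic inference, it may help to check whether the alignments carry the XS tag at all. The sketch below counts records with an XS:A: optional field in SAM text (for example, a sample of lines from samtools view input.bam). Not every record necessarily carries the tag, so it reports a count rather than inspecting a single record. The function name count_xs_records and the example records are made up for illustration.

```python
def count_xs_records(sam_lines):
    """Return (records_with_XS, total_records) for an iterable of SAM text lines."""
    with_xs = 0
    total = 0
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        total += 1
        # The 11 mandatory SAM columns come first; optional fields follow.
        optional = line.rstrip("\n").split("\t")[11:]
        # The strand tag has type A (a single character such as + or -).
        if any(field.startswith("XS:A:") for field in optional):
            with_xs += 1
    return with_xs, total


# Two made-up records: one spliced read with XS, one without the tag.
records = [
    "r1\t99\tchr1\t100\t60\t20M500N30M\t=\t900\t850\t*\t*\tNH:i:1\tXS:A:+",
    "r2\t99\tchr1\t200\t60\t50M\t=\t400\t250\t*\t*\tNH:i:1",
]
print(count_xs_records(records) == (1, 2))
```

If no record in a reasonably large sample carries the tag, passing --library_type explicitly is the safer choice.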

rnabridge-align's Issues

Feature request?

Hello,

This is a great tool and exactly what I was looking for, however I was wondering if it was possible to only perform bridging or partitioning based on the read tag? I'm specifically trying to bridge single cell data and would like to only bridge reads by barcodes. Would this be easy to add?
The other solution on my end was to have a wrapper script that fed in split bams based on my barcodes but this is rather inefficient as you might imagine.

Thanks,
Chang
