
rnavariantcalling's Introduction

RNAvariantcalling

RNAvariantcalling is a tool for calling high-quality variants from RNA sequences. The called variants are filtered against a recently published database of editing sites in the human genome.

  • RNA sequences are mapped with two mappers, STAR and HISAT2, both of which handle splice junctions well
  • Variants are called with Freebayes, and only variants called from both the STAR and the HISAT2 alignments are kept for the next step
  • These reliable variants are then filtered once more against a database of human editing sites
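
The consensus-then-filter logic in the steps above can be sketched in a few lines of Python. This is only an illustration under assumptions, not the pipeline's actual code: it assumes each caller's output is plain VCF text, identifies a variant by (chromosome, position, REF, ALT), and represents the editing-site database as a set of (chromosome, position) pairs. The function names and the toy data are hypothetical.

```python
def read_variant_keys(vcf_text):
    """Collect (chrom, pos, ref, alt) keys from VCF-formatted text."""
    keys = set()
    for line in vcf_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.split("\t")
        chrom, pos, ref, alt = fields[0], int(fields[1]), fields[3], fields[4]
        keys.add((chrom, pos, ref, alt))
    return keys


def consensus_variants(star_vcf, hisat2_vcf, editing_sites):
    """Keep variants seen with both aligners, then drop known editing sites."""
    shared = read_variant_keys(star_vcf) & read_variant_keys(hisat2_vcf)
    return sorted(v for v in shared if (v[0], v[1]) not in editing_sites)


# Toy data, made up for illustration only
star = "#CHROM\tPOS\tID\tREF\tALT\nchr1\t100\t.\tA\tG\nchr1\t200\t.\tC\tT\nchr2\t50\t.\tG\tA\n"
hisat2 = "#CHROM\tPOS\tID\tREF\tALT\nchr1\t100\t.\tA\tG\nchr1\t200\t.\tC\tT\n"
editing = {("chr1", 100)}

print(consensus_variants(star, hisat2, editing))
```

In the toy data, the chr2 variant is dropped because only one aligner supports it, and the chr1:100 variant is dropped because it coincides with a listed editing site.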

Version

1.0

Author

[email protected]

Dependencies

RNAvariantcalling relies on a number of resources to work properly. These are downloaded automatically the first time you run rnavariantcalling.

In addition, rnavariantcalling needs the following tools to be installed on your computer:

  • samtools
  • tabix
  • bgzip
  • GNU-parallel
  • Set::IntervalTree (perl)
  • URI::Escape (perl)
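
Before running the pipeline, it can help to confirm that the command-line tools listed above are on your PATH. The following is a minimal sketch, not part of rnavariantcalling itself. The executable names are assumptions (GNU parallel is usually installed as `parallel`), the helper name is hypothetical, and the Perl modules are not checked here.

```python
import shutil


def find_missing_tools(tools):
    """Return the names from `tools` that are not found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


# Executable names are assumptions; GNU parallel is usually installed as "parallel"
REQUIRED_TOOLS = ["samtools", "tabix", "bgzip", "parallel"]

if __name__ == "__main__":
    missing = find_missing_tools(REQUIRED_TOOLS)
    if missing:
        print("Missing tools: " + ", ".join(missing))
    else:
        print("All required command-line tools were found.")
```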

Installation

1. $ git clone https://github.com/kspham/rnavariantcalling.git
2. $ cd rnavariantcalling
3. $ python setup.py install
4. $ ./configure (generates config.yaml)

Before the first run of rnavariantcalling, you have to download the indexed human genome for the STAR and HISAT2 aligners, as well as the gene annotation of the human genome:

5. $ ./initial.sh
6. Done! rnavariantcalling can now work properly.

Usage

$ rnavariantcalling.py [-h] [--ThreadsN N] [--reads read1 read2 ... readN] [--outdir OUTDIR] --config yamlFile
-h, --help  Show the help message and exit
--reads READS [READS ...], -U READS [READS ...]  Input RNA reads
--outdir OUTDIR, -o OUTDIR  Where the final result will be stored
--ThreadsN N  Number of threads
--config yamlFile  Configuration file in YAML format
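
For readers who want to see how such an interface is typically declared, here is a sketch using Python's `argparse`. It mirrors the options listed above but is not taken from the rnavariantcalling source. The default of 1 for `--ThreadsN` is an assumption, as is the helper name `build_parser`.

```python
import argparse


def build_parser():
    """Declare the options shown in the usage text (a sketch, not the real source)."""
    parser = argparse.ArgumentParser(prog="rnavariantcalling.py")
    parser.add_argument("--reads", "-U", nargs="+", metavar="READS",
                        help="Input RNA reads")
    parser.add_argument("--outdir", "-o", metavar="OUTDIR",
                        help="Where the final result will be stored")
    # The default of 1 thread is an assumption made for this sketch
    parser.add_argument("--ThreadsN", type=int, default=1, metavar="N",
                        help="Number of threads")
    parser.add_argument("--config", required=True, metavar="yamlFile",
                        help="Configuration file in YAML format")
    return parser


args = build_parser().parse_args(
    ["--ThreadsN", "32", "--reads", "1.fastq.gz", "2.fastq.gz",
     "-o", "out", "--config", "config.yaml"]
)
print(args.reads, args.outdir, args.ThreadsN, args.config)
```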

Example

- Paired-end reads:
$ rnavariantcalling.py --ThreadsN 32 --reads 1.fastq.gz 2.fastq.gz -o /output/directory/ --config /path/to/your/config.yaml
- Single-end reads:
$ rnavariantcalling.py --ThreadsN 32 -U unpaired_read.fastq.gz -o /output/directory/ --config /path/to/your/config.yaml

License

MIT

Free Software, Hell Yeah!

rnavariantcalling's People

Contributors

kspham, linuxpham, ptdtan


rnavariantcalling's Issues

rs_id

It turns out that RSID has real problems. Can you turn off the part of the pipeline that converts rs_id?
Thanks.

resource management

It seems that the RNA pipeline causes a 32-core, 128 GB RAM system to halt. Please check the resource (threads/memory) usage.
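
One simple guard against oversubscribing a machine is to cap the requested thread count by the number of cores actually present. The sketch below is an assumption about how that could be done, not a description of the pipeline's behaviour. The function name and the `reserve` parameter are hypothetical, and memory usage is not addressed.

```python
import os


def cap_threads(requested, reserve=2):
    """Limit the requested thread count to the available cores minus a reserve."""
    available = os.cpu_count() or 1  # cpu_count() may return None
    usable = max(1, available - reserve)
    return max(1, min(requested, usable))


print(cap_threads(32))
```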

Variant calling finishes but produces no results

Hi Tan,
After variant calling finishes running, there are no results at all.

/home/hoabo/smartpipe/rnavariantcalling/src/rnavariantcalling.py --ThreadsN 20 -U /media/proj3/easyrna/download/6abb83bad6243f7940813af0911f8790.fastq.gz -o /media/proj3/easyrna/output/a/b/8/d/6abb83bad6243f7940813af0911f8790/vcl

-rw-r--r-- 1 root root 525121146 Nov 16 22:35 /media/proj3/easyrna/download/6abb83bad6243f7940813af0911f8790.fastq.gz

Thanks !!!

navariantcalling.py error

Hi Tan,
There are some errors in the Python code navariantcalling.py because some Git conflict lines have not been removed yet.

err35

Please check it again. Thanks!!

readme

how to use the pipeline.

Logging to a file is needed

Hi Tan,

You should write the pipeline's progress to a log file.

import logging
import logging.handlers

# Create the logger instance
def initLogger():
    oLogger = logging.getLogger("RNAVARIANTCALLING")
    oLogger.setLevel(logging.DEBUG)
    oLoggerHandler = logging.handlers.RotatingFileHandler("./debug.log", maxBytes=10485760, backupCount=10)
    oFormatter = logging.Formatter("%(levelname)s : [%(asctime)s] - [%(filename)s at line (%(lineno)d) of (%(funcName)s) function] - [%(message)s]")
    oLoggerHandler.setFormatter(oFormatter)
    oLogger.addHandler(oLoggerHandler)

# Main function handler
def main():
    # Initialize the logger
    initLogger()

    # Look up logging in the Python documentation for details
    oLogger = logging.getLogger("RNAVARIANTCALLING")
    oLogger.info("AAAAA")

Thanks!!!

freebayes for rna-seq

RNA-seq data has very high coverage in some regions and very low coverage in others (very different from DNA). This causes the memory-intensive scenario that we observed.
We should subsample the high-coverage regions before passing them to freebayes.
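
To make the subsampling idea concrete, here is a toy sketch that caps the number of reads per fixed-size window by random sampling. It works on a plain list of read start positions rather than on alignment files, and the window size, cap, and function name are all arbitrary assumptions. It illustrates the principle only and is not the method used by freebayes or by this pipeline.

```python
import random
from collections import defaultdict


def cap_coverage(read_starts, window=100, max_reads=3, seed=0):
    """Keep at most `max_reads` randomly chosen reads per window of start positions."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    by_window = defaultdict(list)
    for start in read_starts:
        by_window[start // window].append(start)
    kept = []
    for win in sorted(by_window):
        reads = by_window[win]
        if len(reads) > max_reads:
            reads = rng.sample(reads, max_reads)  # downsample crowded windows
        kept.extend(sorted(reads))
    return kept


# Six reads pile up in the first window; two reads fall in a sparse window
starts = [5, 10, 15, 20, 25, 30, 250, 260]
kept = cap_coverage(starts)
print(len(kept), "of", len(starts), "reads kept")
```

Windows at or below the cap are left untouched, so low-coverage regions keep all of their reads.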

snpEff

Integrate snpEff into the pipeline to annotate the effects of variants.

Log file is created in the directory where the tool is run

Hi Tan,

Currently there is a problem: when the pipeline runs, its log files are created automatically in the current working directory.

For example, if I am in /home/hoa and run the pipeline, the log files are created there (even though the git clone went into the rnavariantcalling directory).

Please add one more parameter for setting the log directory.

Thanks !!!
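
A sketch of what such a parameter could look like is given below. The option name `--logdir` and the function name `init_logger` are hypothetical, and the log format is shortened for brevity. The point is only that the log path is built from a user-supplied directory instead of the current working directory.

```python
import argparse
import logging
import logging.handlers
import os


def init_logger(log_dir):
    """Create a rotating log file inside `log_dir` instead of the current directory."""
    os.makedirs(log_dir, exist_ok=True)
    logger = logging.getLogger("RNAVARIANTCALLING")
    logger.setLevel(logging.DEBUG)
    handler = logging.handlers.RotatingFileHandler(
        os.path.join(log_dir, "debug.log"), maxBytes=10485760, backupCount=10
    )
    handler.setFormatter(logging.Formatter("%(levelname)s : [%(asctime)s] - [%(message)s]"))
    logger.addHandler(handler)
    return logger


parser = argparse.ArgumentParser()
# "--logdir" is a hypothetical option name used only in this sketch
parser.add_argument("--logdir", default=".", help="Directory for log files")
```

With this in place, calling `init_logger(parser.parse_args().logdir)` at start-up would send the log file to the chosen directory.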

The code uses os.system to execute commands such as vcftool, hisat2 ...

Hi Tân,
Using os.system in the pipeline code to execute an external command has two problems:

  • it is not secure, because the shell command is not escaped

  • it cannot handle the status of the external command
    I suggest adding these two functions:

    1/ execute a command

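    import subprocess

    # Assumption: a module-level flag replaces the original self.RUNNING_DEBUG_FLAG,
    # since executeCommand is a plain function with no self
    RUNNING_DEBUG_FLAG = 1

    def executeCommand(sCommand):
        # Run the command, capturing stdout and stderr separately as text
        oProcess = subprocess.Popen(sCommand, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE, close_fds=True, universal_newlines=True)
        outData, errData = oProcess.communicate()

        # Print the command's output when debugging is enabled
        if RUNNING_DEBUG_FLAG == 1:
            for lineData in outData.splitlines():
                print("%s" % lineData)

        # Report an error when the command exits with a non-zero status
        if oProcess.returncode != 0:
            print("Command has error:{0}".format(errData))

        return oProcess.returncode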
    

2/ escape shell script

def shellEscape(s):
    # Escape parentheses with a backslash so the shell treats them literally
    return s.replace("(", "\\(").replace(")", "\\)")

Thanks !!!
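
As an alternative to hand-written escaping, Python's standard library already covers both concerns raised here. The sketch below is a suggestion under assumptions, not code from the pipeline: it passes the command as an argument list so that no shell is involved, and it uses `shlex.quote` for cases where a shell string cannot be avoided. The helper name `run_command` and the sample file name are hypothetical.

```python
import shlex
import subprocess
import sys


def run_command(args):
    """Run a command given as a list of arguments, without a shell.

    Returns (exit status, stdout text, stderr text).
    """
    result = subprocess.run(
        args, stdout=subprocess.PIPE, stderr=subprocess.PIPE, universal_newlines=True
    )
    return result.returncode, result.stdout, result.stderr


# If a shell string is unavoidable, shlex.quote escapes each argument
unsafe_name = "sample (1).fastq.gz"
print("ls -l " + shlex.quote(unsafe_name))

# The Python interpreter stands in for an external tool in this example
status, out, err = run_command([sys.executable, "-c", "print('ok')"])
print(status, out.strip())
```

Because the exit status is returned to the caller, the pipeline can stop or retry when a step fails instead of continuing with missing output.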

Annotated 0 bytes

Hi Tan,

After pulling the latest code with git pull, I ran the following commands in order:

  • sudo python setup.py install
  • ./configure
  • ./initial.sh

After that, I resubmitted the workspace, and the output data is 0 bytes.

Thanks !!!
