Exome Pipe

Introduction

This is a Nextflow-based pipeline for analyzing exome data. It runs one of two pathways:

  • Unrelated samples

  • Trio samples (technically not limited to three: any number of related samples can be analyzed together)

Structure

The pipeline is a Nextflow script that pre-processes FASTQ data and then runs sarek and Exomiser on it.

Pipeline Flow:

Stage 1:

  1. cleanBeds: Cleans up the BED file for the sample group

  2. renameFastQs: Groups the FASTQ files into pairs and renames them

  3. runPipe: Runs the sarek pipeline
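As an illustration of the pairing part of renameFastQs, here is a minimal Python sketch. It assumes file names follow the proband_1.fastq.gz pattern and that the trailing 1 or 2 marks the two reads of a pair; the function name group_fastq_pairs is hypothetical and not part of pipe.nf. The README does not describe the naming scheme used for renaming, so only the grouping is shown.

```python
import re
from collections import defaultdict

# Assumed naming scheme: <sample>_<read>.fastq.gz, where <read> is 1 or 2.
FASTQ_NAME = re.compile(r"^(?P<sample>.+)_(?P<read>[12])\.fastq\.gz$")


def group_fastq_pairs(filenames):
    """Group FASTQ file names into (read 1, read 2) pairs keyed by sample."""
    reads = defaultdict(dict)
    for name in filenames:
        match = FASTQ_NAME.match(name)
        if match is None:
            raise ValueError(f"unexpected FASTQ name: {name}")
        reads[match["sample"]][match["read"]] = name

    pairs = {}
    for sample, by_read in sorted(reads.items()):
        # Refuse to continue if one mate of the pair is missing.
        if set(by_read) != {"1", "2"}:
            raise ValueError(f"sample {sample} does not have both reads")
        pairs[sample] = (by_read["1"], by_read["2"])
    return pairs
```

Failing early on an unpaired file keeps an incomplete sample from reaching the much longer sarek run.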

Stage 2 (differs between multiSample and trioSample):

  1. produceHpoString: Produces a short string from the HPO file, using the proband as key

  2. produceExomiserYAML: Produces a YAML analysis file specifically for the proband

  3. produceExomiserBatch: Produces a batch.txt file listing the analysis YAML files (multiSample pipe only)

  4. runExomiser: Runs Exomiser using the batch analysis file
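The two text-producing steps of Stage 2 can be sketched in Python as follows. This is only an illustration under stated assumptions: the layout of the HPO files is not documented here, so the sketch simply collects every identifier of the form HP:0000000 from the text; it also assumes the string is meant for the hpoIds list of an Exomiser analysis file, and that batch.txt holds one analysis file path per line. The function names hpo_string and batch_file_text are hypothetical.

```python
import re

# HPO term identifiers have the form HP: followed by seven digits.
HPO_ID = re.compile(r"HP:\d{7}")


def hpo_string(hpo_text):
    """Return the HPO IDs found in hpo_text as a quoted, comma-separated string."""
    ids = []
    for term in HPO_ID.findall(hpo_text):
        if term not in ids:  # drop duplicates, keep first-seen order
            ids.append(term)
    return ", ".join(f"'{term}'" for term in ids)


def batch_file_text(yaml_paths):
    """Return batch file contents: one analysis YAML path per line (assumed layout)."""
    return "".join(f"{path}\n" for path in yaml_paths)
```

For example, hpo_string("HP:0001250 Seizure\nHP:0001263") gives 'HP:0001250', 'HP:0001263', which can be placed between square brackets in the YAML.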

Use of pipe.nf

First set an alias for Nextflow: alias nextflow=/efs/sam/bin/nextflow

Then copy and paste pipe.nf into Haggis (run vi pipe.nf, press i to enter insert mode, paste the script, then press Esc and type :wq to save and quit).

-c: Configuration file

--bed: BED file to use with the samples (currently always SureSelect_v6.bed)

--fastq: Directory containing the .fastq files

--hpo: Directory containing the HPO files (multiSample), or the HPO file for the proband (trioSample)

--ped: .ped file containing the pedigree

--pipe: Pathway to run, either multiSample or trioSample

Example Commands:

nextflow run pipe.nf -profile slurm \
-c /efs/sam/configScripts/slurm.config \
--bed /efs/sam/Macrogen_HN00115050/SureSelect_v6.bed \
--fastq /efs/sam/Macrogen_HN00115050/fastq \
--hpo /efs/sam/Macrogen_HN00115050/hpo \
--ped /efs/sam/Macrogen_HN00115050/ped \
--pipe multiSample

OR

nextflow run pipe.nf -profile slurm \
-c /efs/sam/configScripts/slurm.config \
--bed /efs/sam/Macrogen_HN00115050/SureSelect_v6.bed \
--fastq /efs/sam/trio_example/fastq \
--hpo /efs/sam/trio_example/hpo/141641.hpo \
--ped /efs/sam/trio_example/ped \
--pipe trioSample

Runtime will be roughly 6-7 hours.
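The two example commands differ only in their paths and in the --pipe value, so the invocation can be assembled programmatically. The Python sketch below is a hypothetical helper (build_command is not part of this repository); it uses only the options listed above and rejects any --pipe value other than multiSample or trioSample.

```python
import shlex

VALID_PIPES = ("multiSample", "trioSample")


def build_command(pipe, bed, fastq, hpo, ped, config, profile="slurm"):
    """Assemble the nextflow invocation for pipe.nf as an argument list."""
    if pipe not in VALID_PIPES:
        raise ValueError(f"--pipe must be one of {VALID_PIPES}, got {pipe!r}")
    return [
        "nextflow", "run", "pipe.nf",
        "-profile", profile,
        "-c", config,
        "--bed", bed,
        "--fastq", fastq,
        "--hpo", hpo,
        "--ped", ped,
        "--pipe", pipe,
    ]


if __name__ == "__main__":
    # Paths taken from the trio example above.
    command = build_command(
        pipe="trioSample",
        bed="/efs/sam/Macrogen_HN00115050/SureSelect_v6.bed",
        fastq="/efs/sam/trio_example/fastq",
        hpo="/efs/sam/trio_example/hpo/141641.hpo",
        ped="/efs/sam/trio_example/ped",
        config="/efs/sam/configScripts/slurm.config",
    )
    print(shlex.join(command))
```

Returning a list rather than a single string means the result can be passed directly to subprocess.run without shell quoting concerns.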

Input Assumptions

  • Samples are named in the format proband_1.fastq.gz
  • .bed files contain the chr prefix and 2 lines of unnecessary headers.
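Given the BED assumption above, a cleanup step could look like the following Python sketch. What cleanBeds actually does is not spelled out in this README, so this assumes that "cleaning" means dropping the two header lines and stripping the chr prefix from the chromosome column; clean_bed_lines is a hypothetical name.

```python
def clean_bed_lines(lines, header_lines=2):
    """Drop leading header lines and strip a leading 'chr' from each BED record.

    Assumption: the first `header_lines` lines are headers, and every other
    non-empty line is a tab-separated BED record starting with the chromosome.
    """
    cleaned = []
    for line in lines[header_lines:]:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        if line.startswith("chr"):
            line = line[len("chr"):]
        cleaned.append(line)
    return cleaned
```

For instance, the record "chr1\t100\t200" becomes "1\t100\t200", while records that already lack the prefix pass through unchanged.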

slurm.config

Configuration parameters for Slurm when running pipe.nf and nf-core/sarek.

DEPRECATED: pipe_multiSample.nf/pipe_trioSample.nf

These contain the test pipelines for analyzing multi-sample and trio data, respectively. Replaced by the --pipe option in pipe.nf.

DEPRECATED: exomiser-template.yml

Template for the Exomiser analysis file.

Ideas for future development

  • Add a function that automatically uploads data to a database

Contributors

srdsam