
Kallisto-NF

A Nextflow implementation of Kallisto & Sleuth RNA-Seq Tools


Quick start

Make sure you have all the required dependencies listed in the last section.

Install the Nextflow runtime by running the following command:

$ curl -fsSL get.nextflow.io | bash

When done, you can launch the pipeline execution by entering the command shown below:

$ nextflow run cbcrg/kallisto-nf

By default the pipeline is executed against the provided example dataset. Check the Pipeline parameters section below to see how to enter your data on the program command line.

Pipeline parameters

--reads

  • Specifies the location of the reads fastq file(s).
  • Multiple files can be specified using the usual wildcards (*, ?); in this case, make sure to surround the parameter value with single quote characters (see the example below).
  • It must end in .fastq.
  • Involved in the task: kallisto-mapping.
  • By default it is set to Kallisto-NF's location: ./tutorial/data/*.fastq

Example:

$ nextflow run cbcrg/kallisto-nf --reads '/home/dataset/*.fastq'

This will handle each fastq file as a separate sample.

Read pairs of samples can be specified using a glob file pattern. Consider a more complex situation where there are three samples (A, B and C), with A and B being paired-end and C being single-ended. The read files could be:

sample_A_1.fastq
sample_A_2.fastq
sample_B_1.fastq
sample_B_2.fastq 
sample_C_1.fastq

The reads may be specified as below:

$ nextflow run cbcrg/kallisto-nf --reads '/home/dataset/sample_*_{1,2}.fastq'    
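
For illustration, Nextflow's fromFilePairs channel factory can group read files matching such a pattern by sample. The sketch below is not the pipeline's actual source; it only shows how the glob above could be turned into per-sample (id, files) tuples:

// Illustrative sketch only: group the read files by sample id.
// size: -1 accepts any number of files per sample, so the paired samples (A, B)
// and the single-end sample (C) are all emitted.
Channel
    .fromFilePairs('/home/dataset/sample_*_{1,2}.fastq', size: -1)
    .view()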

--transcriptome

  • The location of the transcriptome multi-fasta file.
  • It should end in .fa
  • Involved in the task: kallisto-index.
  • By default it is set to Kallisto-NF's location: ./tutorial/data/transcriptome/trascriptome.fa

Example:

$ nextflow run cbcrg/kallisto-nf --transcriptome /home/user/my_transcriptome/example.fa
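
Internally, the indexing task reduces to a single kallisto index call on this file. Below is a minimal, DSL1-style Nextflow process sketch of such a step; the channel names are hypothetical and this is not the pipeline's actual source:

process index {
    input:
    file transcriptome_file from transcriptome

    output:
    file 'transcriptome.index' into transcriptome_index

    // build the Kallisto index from the multi-fasta transcriptome
    """
    kallisto index -i transcriptome.index ${transcriptome_file}
    """
}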

--experiment

  • Specifies the location of the experimental design file.
  • The experimental design file provides Sleuth with a link between the samples, conditions and replicates for abundance testing.
  • By default it is set to Kallisto-NF's location: ./tutorial/experiment/hiseq_info.txt

Example:

$ nextflow run cbcrg/kallisto-nf --experiment '/home/experiment/exp_design.txt'

The experiment file should be a space-delimited text file in a format similar to the one shown below:

run_accession condition sample
SRR493366 control A
SRR493367 control B
SRR493368 control C
SRR493369 HOXA1KD A
SRR493370 HOXA1KD B
SRR493371 HOXA1KD C
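
For reference, a space-delimited design file with this header can be parsed with Nextflow's splitCsv operator. The snippet below is only an illustration of the format (the path is the hypothetical one from the example above), not part of the pipeline:

// Illustrative only: read the design file into rows keyed by the header names.
Channel
    .fromPath('/home/experiment/exp_design.txt')
    .splitCsv(sep: ' ', header: true)
    .view { row -> "${row.run_accession}: condition=${row.condition}, sample=${row.sample}" }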

--fragment_len

  • Specifies the average fragment length of the RNA-Seq library.
  • This is required for mapping single-ended reads.
  • Involved in the task: kallisto-mapping.
  • By default it is set to 180.

Example:

$ nextflow run cbcrg/kallisto-nf --fragment_len 180

--fragment_sd

  • Specifies the standard deviation of the fragment length in the RNA-Seq library.
  • This is required for mapping single-ended reads.
  • Involved in the task: kallisto-mapping.
  • By default this is set to 20.

Example:

$ nextflow run cbcrg/kallisto-nf --fragment_sd 20

--bootstrap

  • Specifies the number of bootstrap samples for quantification of abundances.
  • Involved in the task: kallisto-mapping.
  • By default this is set to 100.

Example:

$ nextflow run cbcrg/kallisto-nf --bootstrap 100
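
Taken together, --fragment_len, --fragment_sd and --bootstrap end up as kallisto quant options in the mapping task. The DSL1-style sketch below, with hypothetical channel names, shows how a single-end sample might be quantified; for paired-end samples the --single, -l and -s options are dropped and both read files are passed instead:

process mapping {
    input:
    file index from transcriptome_index
    set val(sample), file(reads) from read_files

    output:
    file "kallisto_${sample}" into kallisto_dirs

    // quantify a single-end sample, with bootstrapping for Sleuth
    """
    kallisto quant --single -i ${index} -o kallisto_${sample} \
        -l ${params.fragment_len} -s ${params.fragment_sd} \
        -b ${params.bootstrap} ${reads}
    """
}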

--output

  • Specifies the folder where the results will be stored for the user.
  • The folder is created if it does not already exist.
  • By default it is set to Kallisto-NF's folder: ./results

Example:

$ nextflow run cbcrg/kallisto-nf --output /home/user/my_results 
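
A process can publish its outputs into this folder through Nextflow's publishDir directive, which creates the directory when it is missing. The process below is purely hypothetical and only illustrates the directive:

process report {
    // hypothetical example: publishDir copies the declared outputs into
    // params.output, creating the folder if it does not already exist
    publishDir params.output, mode: 'copy'

    output:
    file 'report.txt' into report_files

    """
    echo 'pipeline finished' > report.txt
    """
}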

Cluster support

Kallisto-NF execution relies on the Nextflow framework, which provides an abstraction between the pipeline's functional logic and the underlying processing system.

Thus it is possible to execute it on your computer or on any supported cluster resource manager without modifying the pipeline.

Currently the following platforms are supported:

  • Oracle/Univa/Open Grid Engine (SGE)
  • Platform LSF
  • SLURM
  • PBS/Torque

By default the pipeline is parallelized by spawning multiple threads on the machine where the script is launched.

To submit the execution to an SGE cluster, create a file named nextflow.config in the directory where the pipeline is going to be launched, with the following content:

process {
  executor='sge'
  queue='<your queue name>'
}

When doing that, tasks will be executed through the qsub SGE command, so your pipeline will behave like any other SGE job script, with the benefit that Nextflow will automatically and transparently manage task synchronisation, file staging/un-staging, etc.

Alternatively the same declaration can be defined in the file $HOME/.nextflow/config.

To learn more about the available settings and the configuration file, read the Nextflow documentation.

Dependencies

  • Nextflow (0.24.0 or higher)
  • Docker (alternatively you will need to install the software packages listed here)
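
When Docker is available, Nextflow can run every task inside the pipeline's container, so kallisto, Sleuth and R do not need to be installed locally. As a hedged example, assuming the pipeline defines a default container image, container execution can be requested with the -with-docker command line option or enabled once in nextflow.config:

docker {
  enabled = true
}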


kallisto-nf's Issues

input names different

Hi,
Should the index not be called transcriptome.index, since elsewhere it is used as transcriptome.index rather than index?

Michal

Problem with docker image

I tried to run this workflow but I got this error:

$ nextflow  run cbcrg/kallisto-nf  --experiment /home/maurizio/Desktop/kallisto-nf/tutorial/experiment/hiseq_info.txt 

[warm up] executor > local
[e2/eee5fa] Submitted process > index
ERROR ~ Error executing process > 'index'

Caused by:
  Process `index` terminated with an error exit status (127)

Command executed:

  kallisto index -i transcriptome.index transcriptome.fa

Command exit status:
  127

Command output:
  (empty)

Command error:
  .command.sh: line 2: kallisto: command not found

Work dir:
  /home/maurizio/Desktop/kallisto-nf/work/e2/eee5faa8a66f6bbeb7f1cc01450c73

Tip: you can try to figure out what's wrong by changing to the process work dir and showing the script file named `.command.sh`

 -- Check '.nextflow.log' file for details

Error: unexpected end of input

Hi,
Did you run into the following error:

  reading in kallisto results
  dropping unused factor levels
  ..............................
  normalizing est_counts
  33052 targets passed the filter
  normalizing tpm
  merging in metadata
  summarizing bootstraps
  ..............................
  fitting measurement error models
  shrinkage estimation
  computing variance of betas
  fitting measurement error models
  shrinkage estimation
  computing variance of betas
  Error: unexpected end of input
  Execution halted

What did I miss?

Thank you in advance.

Michal
