kevinmenden / hybrid-assembly

Pipeline for hybrid assembly using short and long reads.

License: MIT License

HTML 7.53% Python 9.66% Nextflow 74.12% Shell 6.63% Dockerfile 2.07%

genome assembly nanopore hybrid-assembly nextflow

hybrid-assembly's Issues

Error with canu assembly

Hi,

I am relatively new to Nextflow. I get an error when I run the pipeline with the canu assembler using the following command. How can I resolve this error?

$ time nextflow run kevinmenden/hybrid-assembly --shortReads '/data01/nextflow_test/1_Raw_data/Illumina/N16005/*_L2_{1,2}.fq.gz' --longReads '/data01/nextflow_test/1_Raw_data/Nanopore/N16005_Nanopore.fastq' --assembler canu --genomeSize 2.8m -profile docker
N E X T F L O W  ~  version 21.04.3
Launching `kevinmenden/hybrid-assembly` [nostalgic_wright] - revision: c2aef5f047 [master]
=========================================
 hybrid-assembly v0.3.2dev
=========================================
WARN: Access to undefined parameter `max_memory` -- Initialise it to a default value eg. `params.max_memory = some_value`
WARN: Access to undefined parameter `max_cpus` -- Initialise it to a default value eg. `params.max_cpus = some_value`
WARN: Access to undefined parameter `max_time` -- Initialise it to a default value eg. `params.max_time = some_value`
Run Name       : nostalgic_wright
Short Reads    : /data01/nextflow_test/1_Raw_data/Illumina/N16005/*_L2_{1,2}.fq.gz
Long Reads     : /data01/nextflow_test/1_Raw_data/Nanopore/N16005_Nanopore.fastq
Fasta Ref      : false
Max Memory     : null
Max CPUs       : null
Max Time       : null
Output dir     : ./results
Working dir    : /data01/nextflow_test/1_Raw_data/work
Container      : kevinmenden/hybrid-assembly:latest
Pipeline Release: master
Current home   : /home/prakki
Current user   : prakki
Current path   : /data01/nextflow_test/1_Raw_data
Script dir     : /home/prakki/.nextflow/assets/kevinmenden/hybrid-assembly
Config Profile : docker
=========================================
executor >  local (3)
[88/c0b23d] process > get_software_versions                      [  0%] 0 of 1
[4c/49b783] process > fastqc (N16005_DDMS210004243-1a_HFMWLDSX2) [  0%] 0 of 1
[25/884a7c] process > canu (N16005_Nanopore)                     [  0%] 0 of 1
[-        ] process > minimap                                    -
[-        ] process > pilon                                      -
[-        ] process > quast_canu                                 -
[-        ] process > multiqc                                    -
Error executing process > 'canu (N16005_Nanopore)'

Caused by:
  Process `canu (N16005_Nanopore)` terminated with an error exit status (1)

Command executed:

  canu \
  -p N16005_Nanopore genomeSize=2.8m -nanopore-raw N16005_Nanopore.fastq gnuplotTested=true \
  correctedErrorRate=0.144 \
  rawErrorRate=0.500 \
  minReadLength=1000 \
  minOverlapLength=500

Command exit status:
  1

Command output:
  
  usage: canu [-correct | -trim | -assemble | -trim-assemble] \
              [-s <assembly-specifications-file>] \
               -p <assembly-prefix> \
               -d <assembly-directory> \
               genomeSize=<number>[g|m|k] \
               errorRate=0.X \
              [other-options] \
              [-pacbio-raw | -pacbio-corrected | -nanopore-raw | -nanopore-corrected] *fastq
  
    By default, all three stages (correct, trim, assemble) are computed.
    To compute only a single stage, use:
      -correct       - generate corrected reads
      -trim          - generate trimmed reads
      -assemble      - generate an assembly
      -trim-assemble - generate trimmed reads and then assemble them
  
    The assembly is computed in the (created) -d <assembly-directory>, with most
    files named using the -p <assembly-prefix>.
  
    The genome size is your best guess of the genome size of what is being assembled.
    It is used mostly to compute coverage in reads.  Fractional values are allowed: '4.7m'
    is the same as '4700k' and '4700000'
  
    The errorRate is not used correctly (we're working on it).  Don't set it
    If you want to change the defaults, use the various utg*ErrorRate options.
  
    A full list of options can be printed with '-options'.  All options
    can be supplied in an optional sepc file.
  
    Reads can be either FASTA or FASTQ format, uncompressed, or compressed
    with gz, bz2 or xz.  Reads are specified by the technology they were
    generated with:
      -pacbio-raw         <files>
      -pacbio-corrected   <files>
      -nanopore-raw       <files>
      -nanopore-corrected <files>
  
  Complete documentation at http://canu.readthedocs.org/en/latest/
  
  ERROR:  Directory not supplied with -d.
  ERROR:  Paramter 'correctedErrorRate' is not known.
  ERROR:  Paramter 'rawErrorRate' is not known.
executor >  local (3)
[-        ] process > get_software_versions                      -
[-        ] process > fastqc (N16005_DDMS210004243-1a_HFMWLDSX2) -
[25/884a7c] process > canu (N16005_Nanopore)                     [100%] 1 of 1, failed: 1 ✘
[-        ] process > minimap                                    -
[-        ] process > pilon                                      -
[-        ] process > quast_canu                                 -
[-        ] process > multiqc                                    -

Command wrapper:
  
  usage: canu [-correct | -trim | -assemble | -trim-assemble] \
              [-s <assembly-specifications-file>] \
               -p <assembly-prefix> \
               -d <assembly-directory> \
               genomeSize=<number>[g|m|k] \
               errorRate=0.X \
              [other-options] \
              [-pacbio-raw | -pacbio-corrected | -nanopore-raw | -nanopore-corrected] *fastq
  
    By default, all three stages (correct, trim, assemble) are computed.
    To compute only a single stage, use:
      -correct       - generate corrected reads
      -trim          - generate trimmed reads
      -assemble      - generate an assembly
      -trim-assemble - generate trimmed reads and then assemble them
  
    The assembly is computed in the (created) -d <assembly-directory>, with most
    files named using the -p <assembly-prefix>.
  
    The genome size is your best guess of the genome size of what is being assembled.
    It is used mostly to compute coverage in reads.  Fractional values are allowed: '4.7m'
    is the same as '4700k' and '4700000'
  
    The errorRate is not used correctly (we're working on it).  Don't set it
    If you want to change the defaults, use the various utg*ErrorRate options.
  
    A full list of options can be printed with '-options'.  All options
    can be supplied in an optional sepc file.
  
    Reads can be either FASTA or FASTQ format, uncompressed, or compressed
    with gz, bz2 or xz.  Reads are specified by the technology they were
    generated with:
      -pacbio-raw         <files>
      -pacbio-corrected   <files>
      -nanopore-raw       <files>
      -nanopore-corrected <files>
  
  Complete documentation at http://canu.readthedocs.org/en/latest/
  
  ERROR:  Directory not supplied with -d.
  ERROR:  Paramter 'correctedErrorRate' is not known.
  ERROR:  Paramter 'rawErrorRate' is not known.

Work dir:
  /data01/nextflow_test/1_Raw_data/work/25/884a7c241f4f68374c5fc59fcf6d14

Tip: you can try to figure out what's wrong by changing to the process work dir and showing the script file named `.command.sh`
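
A note on reading the log above: the usage text is only printed because canu rejected its arguments, so the three `ERROR:` lines at the end of the output are the part that matters. The usage text lists only `errorRate`, which suggests that the canu build inside the container predates the `correctedErrorRate` and `rawErrorRate` options and still requires `-d`. This is an inference from the log, not something confirmed in this thread. A minimal Python sketch for pulling those lines out of a long log follows; `summarize_canu_errors` is a hypothetical helper, not part of the pipeline or of canu.

```python
import re


def summarize_canu_errors(log_text):
    """Split canu 'ERROR:' lines into unknown parameters and other messages."""
    unknown_params = []
    other_errors = []
    for line in log_text.splitlines():
        match = re.match(r"\s*ERROR:\s*(.*)", line)
        if not match:
            continue  # usage text and blank lines are ignored
        message = match.group(1).strip()
        # The log above spells it "Paramter"; accept "Parameter" as well.
        param = re.match(r"Param(?:e)?ter '([^']+)' is not known", message)
        if param:
            unknown_params.append(param.group(1))
        else:
            other_errors.append(message)
    return unknown_params, other_errors


# The last three lines of the command output shown above.
log = """
  ERROR:  Directory not supplied with -d.
  ERROR:  Paramter 'correctedErrorRate' is not known.
  ERROR:  Paramter 'rawErrorRate' is not known.
"""
unknown, other = summarize_canu_errors(log)
print("unknown parameters:", unknown)
print("other errors:", other)
```

If the unknown parameters are ones that newer canu releases accept, comparing the canu version in the container with the version the pipeline script was written for is a reasonable next step.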

Process `spades (Q)` terminated with an error exit status (1)

Dear Kevin,
I ran the hybrid assembly with Docker using the following command, but it ended with an error.
Could you help me find out whether it is a memory error or something else?

(base) ThinkStation-P910:/data$ nextflow run kevinmenden/hybrid-assembly -profile docker --shortReads '*_R{1,2}.fastq.gz' --longReads combinedall.fastq --assembler spades
executor > local (3)
[8e/a9bc99] process > get_software_versions [100%] 1 of 1 ✔
[0e/54207c] process > fastqc (Q) [100%] 1 of 1 ✔
[5a/190f77] process > spades (Q) [ 0%] 0 of 1
[- ] process > quast_spades -
[- ] process > multiqc -
Error executing process > 'spades (Q)'

Caused by:
Process spades (Q) terminated with an error exit status (1)

Command executed:

spades.py -o "spades_results" -t 16 \
-m 700 \
-1 Q_R1.fastq.gz -2 Q_R2.fastq.gz \
--nanopore combinedall.fastq \
-k 21,33,55,77
mv spades_results/scaffolds.fasta scaffolds.fasta
mv spades_results/contigs.fasta contigs.fasta

Command exit status:
1
.
.
.
.
.

== Error == system call for: "['/opt/conda/envs/assembly-env/share/spades-3.12.0-1/bin/spades-hammer', 'spades_results/corrected/configs/config.info']" finished abnormally, err code: -9

In case you have troubles running SPAdes, you can write to [email protected]
or report an issue on our GitHub repository github.com/ablab/spades
Please provide us with params.txt and spades.log files from the output directory.

Command error:
WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.

Work dir:
/media/laptop/zygopyllumtest/Q1 data/work/5a/190f77d2679c458bba66b2142565a1

Tip: when you have fixed the problem you can continue the execution adding the option -resume to the run command line
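
On the memory question: `err code: -9` has the shape of a Python `subprocess` return code, where a negative value `-N` means the child process was terminated by signal `N`. Signal 9 is SIGKILL, which on Linux is often (though not only) sent by the kernel's out-of-memory killer. That would fit a run started with `-m 700`, a 700 GB memory limit, but the log alone does not prove it. The sketch below decodes such a code, assuming a POSIX system; `describe_exit_code` is a hypothetical helper, not part of SPAdes or the pipeline.

```python
import signal


def describe_exit_code(code):
    """Explain a subprocess-style return code; negative means killed by a signal."""
    if code == 0:
        return "success"
    if code > 0:
        return f"exited with status {code}"
    try:
        # Map the signal number to its symbolic name, e.g. 9 -> SIGKILL.
        name = signal.Signals(-code).name
    except ValueError:
        # Signal number not known on this platform.
        return f"killed by signal {-code}"
    return f"killed by {name}"


print(describe_exit_code(-9))
```

If the process was killed for memory reasons, the kernel log on the host (for example via `dmesg`) usually records it, and lowering `-m` to what the machine actually has is worth trying.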

Path string cannot be empty Missing `fromPath` parameter

I tried to use the Docker version, but it shows the following error. Could you kindly help me solve this?

Thank you

nextflow run kevinmenden/hybrid-assembly -profile docker --reads '/media/laptop/DBG2OLC-master/Pacbio.fasta'
N E X T F L O W ~ version 20.10.0
Launching kevinmenden/hybrid-assembly [exotic_leakey] - revision: c2aef5f [master]
Path string cannot be empty
Missing fromPath parameter
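
A possible reading of this error: the other commands on this page pass `--shortReads` and `--longReads`, while this one passes only `--reads`, so the pipeline may be building its input channel from an empty path. That is an assumption drawn from those commands, not from the pipeline source. A small Python sketch that checks a command line for the expected options before launching follows; `REQUIRED_OPTIONS` and `missing_options` are hypothetical names used only for illustration.

```python
import shlex

# Assumption: both options are needed, as in the other commands on this page.
REQUIRED_OPTIONS = ("--shortReads", "--longReads")


def missing_options(command_line, required=REQUIRED_OPTIONS):
    """Return the required '--option' names that do not appear in the command."""
    tokens = shlex.split(command_line)
    # Keep only pipeline-style options and drop any '=value' suffix.
    present = {t.split("=", 1)[0] for t in tokens if t.startswith("--")}
    return [opt for opt in required if opt not in present]


cmd = (
    "nextflow run kevinmenden/hybrid-assembly -profile docker "
    "--reads '/media/laptop/DBG2OLC-master/Pacbio.fasta'"
)
print(missing_options(cmd))
```

For the command above, the check reports both options as missing, which matches the "Path string cannot be empty" message under this assumption.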
