
biowdl / germline-dna


A BioWDL variant-calling pipeline for germline DNA data. It starts with FASTQ files and produces VCF files. Category: Multi-Sample

Home Page: https://biowdl.github.io/germline-DNA/

License: MIT License

WDL 92.40% Python 7.60%
pipeline variantcalling multisample wdl gatk-bestpractices gatk4 germline-variants somatic-variants structural-variation

germline-dna's Introduction

germline-DNA

This repository contains the BioWDL workflow for processing germline DNA data. It starts with FASTQ files and produces VCF files. It can switch between joint genotyping and single-sample calling modes. It can also call the X and Y chromosomes with the correct ploidy if given BED files for the non-PAR regions.

Documentation

Documentation for this workflow can be found at https://biowdl.github.io/germline-DNA/.

About

This workflow is part of BioWDL, developed by the SASC team at Leiden University Medical Center.

Contact

For any question related to germline-DNA, please use the github issue tracker or contact the SASC team directly at: [email protected].

germline-dna's People

Contributors

amfgcp, cagaser, davycats, ffinfo, galaxy001, hailiangmei, jasperboom, rhpvorderman


germline-dna's Issues

options.json

In the documentation, the example for options.json is:

{
  "final_workflow_outputs_dir": "/path/to/outputs",
  "use_relative_output_paths": true,
}

The comma after true needs to be removed or else Cromwell expects another option.

Exception in thread "main" cromwell.CromwellEntryPoint$$anon$2: ERROR: Unable to submit workflow to Cromwell::
Unexpected character '}' at input index 125 (line 4, position 1), expected '"':
}
^

        at cromwell.CromwellEntryPoint$.$anonfun$validOrFailSubmission$1(CromwellEntryPoint.scala:274)
        at cats.data.Validated.valueOr(Validated.scala:110)
        at cromwell.CromwellEntryPoint$.validOrFailSubmission(CromwellEntryPoint.scala:274)
        at cromwell.CromwellEntryPoint$.validateRunArguments(CromwellEntryPoint.scala:270)
        at cromwell.CromwellEntryPoint$.runSingle(CromwellEntryPoint.scala:66)
        at cromwell.CromwellApp$.runCromwell(CromwellApp.scala:14)
        at cromwell.CromwellApp$.delayedEndpoint$cromwell$CromwellApp$1(CromwellApp.scala:25)
        at cromwell.CromwellApp$delayedInit$body.apply(CromwellApp.scala:3)
        at scala.Function0.apply$mcV$sp(Function0.scala:39)
        at scala.Function0.apply$mcV$sp$(Function0.scala:39)
        at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:17)
        at scala.App.$anonfun$main$1$adapted(App.scala:80)
        at scala.collection.immutable.List.foreach(List.scala:392)
        at scala.App.main(App.scala:80)
        at scala.App.main$(App.scala:78)
        at cromwell.CromwellApp$.main(CromwellApp.scala:3)
        at cromwell.CromwellApp.main(CromwellApp.scala)
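
The trailing comma can be caught before submitting to Cromwell by parsing the file with any strict JSON parser. A minimal sketch using Python's standard json module follows; the helper name check_options is hypothetical, and the two strings reproduce the documented example with and without the trailing comma.

```python
import json

# The documented example, including the trailing comma that Cromwell rejects.
BROKEN = """{
  "final_workflow_outputs_dir": "/path/to/outputs",
  "use_relative_output_paths": true,
}"""

# The same options with the trailing comma removed.
FIXED = """{
  "final_workflow_outputs_dir": "/path/to/outputs",
  "use_relative_output_paths": true
}"""


def check_options(text):
    """Return (True, parsed dict) for valid JSON, else (False, error text)."""
    try:
        return True, json.loads(text)
    except json.JSONDecodeError as err:
        # lineno and colno point at the position where parsing failed.
        return False, f"line {err.lineno}, column {err.colno}: {err.msg}"


if __name__ == "__main__":
    print(check_options(BROKEN))
    print(check_options(FIXED))
```

The exact error text depends on the Python version, so the sketch only reports it and does not rely on it.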

bwa index wrong

Version: germline_v5.0.0, Cromwell 40.
Because I don't have Docker permissions, I run the pipeline locally.
I encountered an index error during the bwa step:

bwa mem \
>   -t 20 \
>    \
>   -R '@RG\tID:wgs1-paired-end-lib1-rg1\tLB:lib1\tSM:wgs1-paired-end\tPL:illumina' \
>   /lihaicheng/project10/germline_v5.0.0/cromwell-executions/Germline/847d978c-7bff-47c6-8aab-cdc02b87d308/call-sampleWorkflow/shard-0/SampleWorkflow/a67eafdd-e8bd-4c4e-bf0f-e6a37d561390/call-bwaMem/shard-0/inputs/-1499169405/GRCh38_full_analysis_set_plus_decoy_hla.fa \
>   /lihaicheng/project10/germline_v5.0.0/cromwell-executions/Germline/847d978c-7bff-47c6-8aab-cdc02b87d308/call-sampleWorkflow/shard-0/SampleWorkflow/a67eafdd-e8bd-4c4e-bf0f-e6a37d561390/call-bwaMem/shard-0/inputs/1238596114/V350140077_L01_read_1.fq.gz \
>  /lihaicheng/project10/germline_v5.0.0/cromwell-executions/Germline/847d978c-7bff-47c6-8aab-cdc02b87d308/call-sampleWorkflow/shard-0/SampleWorkflow/a67eafdd-e8bd-4c4e-bf0f-e6a37d561390/call-bwaMem/shard-0/inputs/1238596114/V350140077_L01_read_2.fq.gz

[E::bwa_idx_load_from_disk] fail to locate the index files
Inputs directory
/lihaicheng/project10/germline_v5.0.0/cromwell-executions/Germline/847d978c-7bff-47c6-8aab-cdc02b87d308/call-sampleWorkflow/shard-0/SampleWorkflow/a67eafdd-e8bd-4c4e-bf0f-e6a37d561390/call-bwaMem/shard-0/inputs

.
├── 1238596114
│   ├── V350140077_L01_read_1.fq.gz
│   └── V350140077_L01_read_2.fq.gz
├── -1359520590
│   ├── GRCh38_full_analysis_set_plus_decoy_hla.fa.amb
│   ├── GRCh38_full_analysis_set_plus_decoy_hla.fa.ann
│   ├── GRCh38_full_analysis_set_plus_decoy_hla.fa.bwt
│   ├── GRCh38_full_analysis_set_plus_decoy_hla.fa.pac
│   └── GRCh38_full_analysis_set_plus_decoy_hla.fa.sa
└── -1499169405
    └── GRCh38_full_analysis_set_plus_decoy_hla.fa

It seems to be caused by the index files not being in the same folder as the FASTA file, but I don't understand why this happens.
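
bwa mem receives the FASTA path as the index prefix and looks for files named after that prefix with the index suffixes appended. In the listing above, the FASTA sits in -1499169405 while the five index files sit in -1359520590, so the prefix does not resolve. A minimal sketch of a check for this situation follows; the function name missing_bwa_index_files is hypothetical, and the suffix list is taken from the directory tree above. Why the files were localized into separate directories is not established in this thread.

```python
import tempfile
from pathlib import Path

# Suffixes of the index files shown in the directory tree above.
BWA_INDEX_SUFFIXES = (".amb", ".ann", ".bwt", ".pac", ".sa")


def missing_bwa_index_files(fasta):
    """Return the expected index files that are absent next to the FASTA."""
    fasta = Path(fasta)
    expected = [Path(str(fasta) + suffix) for suffix in BWA_INDEX_SUFFIXES]
    return [path for path in expected if not path.is_file()]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        fasta = Path(tmp) / "reference.fa"
        fasta.touch()
        # Only the FASTA exists, so every index file is reported missing.
        print([p.name for p in missing_bwa_index_files(fasta)])
        for suffix in BWA_INDEX_SUFFIXES:
            Path(str(fasta) + suffix).touch()
        # All index files now sit beside the FASTA.
        print(missing_bwa_index_files(fasta))
```

Running such a check on the original reference location, before starting the workflow, shows whether the FASTA and its index share a directory there.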

Controlling number of jobs

Hello,

I have set concurrent-job-limit = 8 in my Cromwell config file to try to limit the number of concurrent jobs. However, when it comes to the Mutect2 and VarDict steps in the Somatic pipeline, more than 8 jobs run at the same time. This is a problem because all the memory is used up.

2020-12-10 16:43:44,082 INFO  - BackgroundConfigAsyncJobExecutionActor [UUID(5ce93106)VarDict.varDict:21:1]: job id: 533398
2020-12-10 16:43:44,082 INFO  - BackgroundConfigAsyncJobExecutionActor [UUID(5ce93106)VarDict.varDict:26:1]: job id: 533404
2020-12-10 16:43:44,082 INFO  - BackgroundConfigAsyncJobExecutionActor [UUID(5ce93106)VarDict.varDict:9:1]: job id: 533400
2020-12-10 16:43:44,082 INFO  - BackgroundConfigAsyncJobExecutionActor [UUID(5ce93106)VarDict.varDict:8:1]: job id: 533429
2020-12-10 16:43:44,082 INFO  - BackgroundConfigAsyncJobExecutionActor [UUID(5ce93106)VarDict.varDict:15:1]: job id: 533528
2020-12-10 16:43:44,087 INFO  - BackgroundConfigAsyncJobExecutionActor [UUID(5ce93106)VarDict.varDict:19:1]: job id: 533556
2020-12-10 16:43:44,085 INFO  - BackgroundConfigAsyncJobExecutionActor [UUID(5ce93106)VarDict.varDict:12:1]: job id: 533514
2020-12-10 16:43:44,087 INFO  - BackgroundConfigAsyncJobExecutionActor [UUID(5ce93106)VarDict.varDict:1:1]: job id: 533544
2020-12-10 16:43:44,085 INFO  - BackgroundConfigAsyncJobExecutionActor [UUID(5ce93106)VarDict.varDict:6:1]: job id: 533545
2020-12-10 16:43:44,082 INFO  - BackgroundConfigAsyncJobExecutionActor [UUID(5ce93106)VarDict.varDict:16:1]: job id: 533490
2020-12-10 16:43:44,087 INFO  - BackgroundConfigAsyncJobExecutionActor [UUID(7039cc98)Mutect2.mutect2:2:1]: job id: 533798
2020-12-10 16:43:44,082 INFO  - BackgroundConfigAsyncJobExecutionActor [UUID(5ce93106)VarDict.varDict:7:1]: job id: 533495
2020-12-10 16:43:44,082 INFO  - BackgroundConfigAsyncJobExecutionActor [UUID(5ce93106)VarDict.varDict:24:1]: job id: 533481
2020-12-10 16:43:44,085 INFO  - BackgroundConfigAsyncJobExecutionActor [UUID(5ce93106)VarDict.varDict:5:1]: job id: 533533
2020-12-10 16:43:44,085 INFO  - BackgroundConfigAsyncJobExecutionActor [UUID(5ce93106)VarDict.varDict:22:1]: job id: 533525
2020-12-10 16:43:44,085 INFO  - BackgroundConfigAsyncJobExecutionActor [UUID(5ce93106)VarDict.varDict:14:1]: job id: 533524
2020-12-10 16:43:44,088 INFO  - BackgroundConfigAsyncJobExecutionActor [UUID(7039cc98)Mutect2.mutect2:1:1]: job id: 534329
2020-12-10 16:43:44,082 INFO  - BackgroundConfigAsyncJobExecutionActor [UUID(5ce93106)VarDict.varDict:17:1]: job id: 533494
2020-12-10 16:43:44,082 INFO  - BackgroundConfigAsyncJobExecutionActor [UUID(5ce93106)VarDict.varDict:20:1]: job id: 533421
2020-12-10 16:43:44,086 INFO  - BackgroundConfigAsyncJobExecutionActor [UUID(7039cc98)Mutect2.mutect2:0:1]: job id: 533756
2020-12-10 16:43:44,083 INFO  - BackgroundConfigAsyncJobExecutionActor [UUID(5ce93106)VarDict.varDict:2:1]: job id: 533426
2020-12-10 16:43:44,086 INFO  - BackgroundConfigAsyncJobExecutionActor [UUID(5ce93106)VarDict.varDict:18:1]: job id: 533470
2020-12-10 16:43:44,086 INFO  - BackgroundConfigAsyncJobExecutionActor [UUID(7039cc98)Mutect2.mutect2:3:1]: job id: 533760
2020-12-10 16:43:44,086 INFO  - BackgroundConfigAsyncJobExecutionActor [UUID(5ce93106)VarDict.varDict:13:1]: job id: 533573

Do you have any advice on how to prevent this? My WDL input file follows this example and my Cromwell config file follows this but with concurrent-job-limit = 8. I did generate a WDL input file from somatic.wdl but I'm not sure which parameters I should set.

Many thanks,
Dave
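
To quantify how many jobs a log excerpt like the one above reports, the "job id" lines can be tallied per call. The sketch below is illustrative only: the regular expression is inferred from the log lines quoted in this issue, and the name count_jobs_per_call is hypothetical. A "job id" line shows that a job was started, not that it is still running, so the tally is an upper bound on concurrency within the excerpt.

```python
import re
from collections import Counter

# Matches lines such as:
#   ... [UUID(5ce93106)VarDict.varDict:21:1]: job id: 533398
JOB_LINE = re.compile(
    r"\[UUID\((?P<uuid>\w+)\)(?P<call>[\w.]+):(?P<shard>\d+):(?P<attempt>\d+)\]"
    r": job id: (?P<job_id>\d+)"
)


def count_jobs_per_call(log_text):
    """Count 'job id' log lines per call name (for example VarDict.varDict)."""
    counts = Counter()
    for match in JOB_LINE.finditer(log_text):
        counts[match.group("call")] += 1
    return counts


if __name__ == "__main__":
    sample = (
        "2020-12-10 16:43:44,082 INFO  - BackgroundConfigAsyncJobExecutionActor "
        "[UUID(5ce93106)VarDict.varDict:21:1]: job id: 533398\n"
        "2020-12-10 16:43:44,087 INFO  - BackgroundConfigAsyncJobExecutionActor "
        "[UUID(7039cc98)Mutect2.mutect2:2:1]: job id: 533798\n"
    )
    print(count_jobs_per_call(sample))
```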

Release 4.0.0

  • Check outstanding issues on JIRA and Github
  • Update all submodules to latest master with: git submodule foreach "git checkout master;git pull; git submodule foreach --recursive 'git fetch'; git submodule update --init --recursive"
  • Check all submodules are tagged correctly with git submodule
  • Run tests to confirm the to-be-released version works.
  • Generate inputs overview using wdl-aid:
    wdl-aid --strict -t scripts/docs_template.md.j2 pipeline.wdl > docs/inputs.md
  • Publish documentation (updateDocs.sh) from develop branch
    • Copy docs folder to gh-pages branch
    • Overwrite existing develop folder with docs folder on gh-pages
    • Push changes to gh-pages branch
  • Check latest documentation looks fine
  • Change current development version in CHANGELOG.md to stable version.
  • Run the release script release.sh
    • Check all submodules are tagged
    • Merge the develop branch into master
    • Create an annotated tag with the stable version number. Include changes
      from changelog.md.
    • Confirm or set stable version to be used for tagging
    • Push tag to remote.
    • Merge master branch back into develop.
    • Add updated version number to develop
  • Publish documentation (updateDocs.sh) from master branch
    • Copy docs folder to gh-pages branch
    • Rename docs to new stable version on gh-pages
    • Set latest version to new version
    • Push changes to gh-pages branch
  • Create a new release from the pushed tag on github
  • Prepare the repo for packaging by git checkout master && git submodule update --init --recursive
    • Package the wdl files with wdl-packager --reproducible -a LICENSE -a dockerImages.yml <WDL_FILE>
    • Add the package(s) to the github release. Also add the original WDL file
      as <pipeline>_<version>.wdl following the same naming as the package.
      This allows for usage of wdl and imports zip with cromwell without
      requiring the user to extract the package.

Release 2.0.0

Release checklist

  • Check outstanding issues on JIRA and Github
  • Generate inputs overview using wdl-aid:
    wdl-aid --strict -t scripts/docs_template.md.j2 pipeline.wdl > docs/inputs.md
  • Publish documentation (updateDocs.sh) from develop branch
    • Copy docs folder to gh-pages branch
    • Overwrite existing develop folder with docs folder on gh-pages
    • Push changes to gh-pages branch
  • Check latest documentation looks fine
  • Update all submodules to latest master with: git submodule foreach "git checkout master;git pull; git submodule foreach --recursive 'git fetch'; git submodule update --init --recursive"
  • Check all submodules are tagged correctly with git submodule
  • Run tests to confirm the to-be-released version works.
  • Change current development version in CHANGELOG.md to stable version.
  • Run the release script release.sh
    • Check all submodules are tagged
    • Merge the develop branch into master
    • Create an annotated tag with the stable version number. Include changes
      from changelog.md.
    • Confirm or set stable version to be used for tagging
    • Push tag to remote.
    • Merge master branch back into develop.
    • Add updated version number to develop
  • Publish documentation (updateDocs.sh) from master branch
    • Copy docs folder to gh-pages branch
    • Rename docs to new stable version on gh-pages
    • Set latest version to new version
    • Push changes to gh-pages branch
  • Create a new release from the pushed tag on github

Release 3.0.0

Release checklist

  • Check outstanding issues on JIRA and Github
  • Update all submodules to latest master with: git submodule foreach "git checkout master;git pull; git submodule foreach --recursive 'git fetch'; git submodule update --init --recursive"
  • Check all submodules are tagged correctly with git submodule
  • Run tests to confirm the to-be-released version works.
  • Generate inputs overview using wdl-aid:
    wdl-aid --strict -t scripts/docs_template.md.j2 pipeline.wdl > docs/inputs.md
  • Publish documentation (updateDocs.sh) from develop branch
    • Copy docs folder to gh-pages branch
    • Overwrite existing develop folder with docs folder on gh-pages
    • Push changes to gh-pages branch
  • Check latest documentation looks fine
  • Change current development version in CHANGELOG.md to stable version.
  • Run the release script release.sh
    • Check all submodules are tagged
    • Merge the develop branch into master
    • Create an annotated tag with the stable version number. Include changes
      from changelog.md.
    • Confirm or set stable version to be used for tagging
    • Push tag to remote.
    • Merge master branch back into develop.
    • Add updated version number to develop
  • Publish documentation (updateDocs.sh) from master branch
    • Copy docs folder to gh-pages branch
    • Rename docs to new stable version on gh-pages
    • Set latest version to new version
    • Push changes to gh-pages branch
  • Create a new release from the pushed tag on github

Release 5.0.0

  • Check outstanding issues on JIRA and Github.
  • Update all submodules to latest master
    with: git submodule foreach "git checkout master;git pull; git submodule foreach --recursive 'git fetch'; git submodule update --init --recursive".
  • Check all submodules are tagged correctly with git submodule.
  • Run tests to confirm the to-be-released version works.
  • Generate inputs overview using wdl-aid:
    wdl-aid --strict -t scripts/docs_template.md.j2 pipeline.wdl > docs/inputs.md
  • Publish documentation (updateDocs.sh) from develop branch.
    • Copy docs folder to gh-pages branch.
    • Overwrite existing develop folder with docs folder on gh-pages.
    • Push changes to gh-pages branch.
  • Check latest documentation looks fine.
  • Change current development version in CHANGELOG.md to stable version.
  • Run the release script release.sh.
    • Check all submodules are tagged.
    • Merge the develop branch into master.
    • Create an annotated tag with the stable version number.
      Include changes from changelog.md.
    • Confirm or set stable version to be used for tagging.
    • Push tag to remote.
    • Merge master branch back into develop.
    • Add updated version number to develop.
  • Publish documentation (updateDocs.sh) from master branch.
    • Copy docs folder to gh-pages branch.
    • Rename docs to new stable version on gh-pages.
    • Set latest version to new version.
    • Push changes to gh-pages branch.
  • Create a new release from the pushed tag on github.
  • Prepare the repo for packaging
    by git checkout master && git submodule update --init --recursive.
    • Package the wdl files with wdl-packager --reproducible -a LICENSE -a dockerImages.yml <WDL_FILE>.
    • Add the package(s) to the github release. Also add the original WDL
      file as <pipeline>_<version>.wdl following the same naming as the
      package.
      This allows for usage of wdl and imports zip with cromwell without
      requiring the user to extract the package.
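
The packaging step at the end of the checklist can be sketched as two small helpers: one that assembles the wdl-packager command line and one that builds the <pipeline>_<version>.wdl file name. The flags are the ones quoted in the checklist; the function names, the example pipeline name "germline", and the version string "v5.0.0" are assumptions made for illustration. The sketch only builds the command and does not execute it.

```python
import shlex


def packager_command(wdl_file):
    """Build the wdl-packager argument list quoted in the release checklist."""
    return [
        "wdl-packager",
        "--reproducible",
        "-a", "LICENSE",
        "-a", "dockerImages.yml",
        wdl_file,
    ]


def release_wdl_name(pipeline, version):
    """Name the plain WDL copy following the <pipeline>_<version>.wdl pattern."""
    return f"{pipeline}_{version}.wdl"


if __name__ == "__main__":
    # Hypothetical pipeline file and version, for illustration only.
    print(shlex.join(packager_command("germline.wdl")))
    print(release_wdl_name("germline", "v5.0.0"))
```

Keeping the command as a list makes it straightforward to pass to subprocess.run later without shell quoting issues.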

Tests

Hello, how do I run the tests? A tests folder is included in the repository, but the docs do not seem to describe how to use it.
