
ENCODE whole-genome bisulfite sequencing (WGBS) pipeline

License: MIT License

bioinformatics wgbs methylation ngs pipeline

wgbs-pipeline's Introduction

wgbs-pipeline


Overview

An ENCODE pipeline for processing whole-genome bisulfite sequencing (WGBS) and reduced representation bisulfite sequencing (RRBS) data using gemBS for alignment and methylation extraction.

Installation

  1. Git clone this pipeline.

    $ git clone https://github.com/ENCODE-DCC/wgbs-pipeline
  2. Install Caper, which requires Java >= 1.8, Docker, and Python >= 3.6. Caper is a Python wrapper for Cromwell.

    $ pip install caper  # use pip3 if it doesn't work
  3. Follow Caper's README carefully to configure it for your platform (local, cloud, cluster, etc.)

IMPORTANT: Configure your Caper configuration file ~/.caper/default.conf correctly for your platform.
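
For reference, a minimal local-backend configuration might look like the following. This is a sketch only: the paths are placeholders, and cluster or cloud backends need additional settings (see Caper's README and the example configuration files in the issue reports below).

backend=local

# Local directory for localized files and Cromwell's intermediate files.
local-loc-dir=/path/to/scratch

# Paths to the Cromwell and Womtool JARs that Caper downloads on first run.
cromwell=/path/to/cromwell.jar
womtool=/path/to/womtool.jar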

Usage

To verify your installation, you can run the pipeline on a test data set by invoking the following command from the root of the cloned repository.

Note: this will incur some cost when running in cloud environments.

$ caper run wgbs-pipeline.wdl -i tests/functional/json/test_wgbs.json --docker

For detailed usage, see usage

Inputs

See inputs
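
For reference, the input JSON files shown in the issue reports further down this page all share the following shape. This sketch uses placeholder paths and is not an exhaustive list of inputs; note that wgbs.fastqs is a triply nested list, with each read pair in the innermost list.

{
  "wgbs.reference": "/path/to/reference.fa.gz",
  "wgbs.extra_reference": "/path/to/conversion_control.fa.gz",
  "wgbs.fastqs": [
    [
      [
        "/path/to/sample_R1.fastq.gz",
        "/path/to/sample_R2.fastq.gz"
      ]
    ]
  ],
  "wgbs.sample_names": ["my_sample"],
  "wgbs.underconversion_sequence_name": "NC_001416.1"
}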

Outputs

See outputs

Contributing

We welcome comments, questions, suggestions, bug reports, feature requests, and pull requests (PRs). Please use one of the existing GitHub issue templates if applicable. When contributing code, please follow the Developer Guidelines.

wgbs-pipeline's People

Contributors

bek, paul-sud


wgbs-pipeline's Issues

[bug] localization error when using Cromwell via Docker

This isn't a bug proper, but rather an issue I can link from the documentation to explain the problem with using Cromwell inside Docker. I'll likely open and close it once I've fleshed it out fully:

Localization via hard link has failed: /cromwell-executions/wgbs/47d02aff-5371-478e-a412-54189e06b303/call-flatten_/inputs/-1474066501/flowcell_1_1_1.fastq.gz -> /opt/data/fastq/flowcell_1_1_1.fastq.gz: Invalid cross-device link

Carry on! And thanks to @Bek for the help and for pointing this out earlier!
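
If you hit this error, one possible workaround (a sketch, not verified against this pipeline) is to tell Cromwell to fall back to soft links or copies instead of hard links, via a custom backend file. The file name and the --backend-file flag here are illustrative; check your Caper version's documentation:

# custom_backend.conf (hypothetical), passed as: caper run ... --backend-file custom_backend.conf
backend {
  providers {
    Local {
      config {
        filesystems {
          local {
            # Try soft links first, then copy; avoids hard links that
            # fail with "Invalid cross-device link" across filesystems.
            localization: ["soft-link", "copy"]
          }
        }
      }
    }
  }
}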

File latency issue?

Describe the bug
The pipeline creates a file that it doesn't think exists. In the example below, one case of this would be /redacted/wgbs_test/wgbs/91e49ae4-9226-4824-af45-301fc1a815e8/call-map/shard-0/inputs/667055062/indexes.tar.gz, which does exist when I check after the fact.

This is running on a compute cluster. I've repeated this multiple times, so it's not just a one-off error.

My first guess is that this is due to file system latency. If this were Snakemake, I'd know that I just needed to increase the value passed to --latency-wait, but I'm not sure how to set something like that for this pipeline.
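
For what it's worth, Cromwell has no direct equivalent of --latency-wait, but one commonly used workaround for shared-filesystem latency (a sketch, not verified against this pipeline) is a script-epilogue that sleeps and syncs after each task, set in a custom backend file:

# custom_backend.conf (hypothetical); merge into the slurm provider's config block
backend {
  providers {
    slurm {
      config {
        # Runs after each task's command finishes; gives NFS/Lustre time
        # to make freshly written files visible before Cromwell checks outputs.
        script-epilogue = "sleep 30 && sync"
      }
    }
  }
}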

OS/Platform

  • OS/Platform: Ubuntu 20.04.6 LTS
  • Pipeline version: 1.1.7
  • Caper version: 2.3.2

Caper configuration file

backend=slurm

# SLURM partition. DEFINE ONLY IF REQUIRED BY YOUR CLUSTER'S POLICY.
# You must define it for Stanford Sherlock.
slurm-partition=serial

# SLURM account. DEFINE ONLY IF REQUIRED BY YOUR CLUSTER'S POLICY.
# You must define it for Stanford SCG.
slurm-account=

# Local directory for localized files and Cromwell's intermediate files.
# If not defined then Caper will make .caper_tmp/ on CWD or `local-out-dir`.
# /tmp is not recommended since Caper store localized data files here.
local-loc-dir=

cromwell=/redacted/caper/2.3.2/jars/cromwell-82.jar
womtool=/redacted/caper/2.3.2/jars/womtool-82.jar

Input JSON file

{
  "wgbs.benchmark_mode": true,
  "wgbs.extra_reference": "/redacted2/encode-wgbs/wgbs-pipeline/tests/data/conversion_control.fa.gz",
  "wgbs.fastqs": [
    [
      [
        "/redacted2/encode-wgbs/wgbs-pipeline/tests/data/sample5_data_1_200000.fastq.gz",
        "/redacted2/encode-wgbs/wgbs-pipeline/tests/data/sample5_data_2_200000.fastq.gz"
      ]
    ]
  ],
  "wgbs.reference": "/redacted2/encode-wgbs/wgbs-pipeline/tests/data/sacCer3.fa.gz",
  "wgbs.sample_names": [
    "sample5"
  ],
  "wgbs.underconversion_sequence_name": "NC_001416.1"
}

Error log

2023-10-04 21:10:58,310|caper.cli|INFO| Cromwell stdout: /redacted/wgbs_test/cromwell.out.1
2023-10-04 21:10:58,315|caper.caper_base|INFO| Creating a timestamped temporary directory. /redacted/wgbs_test/.caper_tmp/wgbs-pipeline/20231004_211058_313834
2023-10-04 21:10:58,315|caper.caper_runner|INFO| Localizing files on work_dir. /redacted/wgbs_test/.caper_tmp/wgbs-pipeline/20231004_211058_313834
2023-10-04 21:10:58,686|caper.caper_workflow_opts|INFO| Singularity image found in WDL metadata. wdl=/redacted2/encode-wgbs/1.1.8/wgbs-pipeline.wdl, s=docker://encodedcc/wgbs-pipeline:1.1.7
2023-10-04 21:10:58,706|caper.cromwell|INFO| Validating WDL/inputs/imports with Womtool...
2023-10-04 21:11:04,509|caper.nb_subproc_thread|INFO| Subprocess finished successfully.
2023-10-04 21:11:04,510|caper.cromwell|INFO| Passed Womtool validation.            
2023-10-04 21:11:04,510|caper.caper_runner|INFO| launching run: wdl=/redacted2/encode-wgbs/1.1.8/wgbs-pipeline.wdl, inputs=/redacted/wgbs_test/test_wgbs.json, backend_conf=/redacted/wgbs_test/.caper_tmp/wgbs-pipeline/20231004_211058_313834/backend.co
2023-10-04 21:11:15,543|caper.cromwell_workflow_monitor|INFO| Workflow: id=91e49ae4-9226-4824-af45-301fc1a815e8, status=Submitted
2023-10-04 21:11:15,605|caper.cromwell_workflow_monitor|INFO| Workflow: id=91e49ae4-9226-4824-af45-301fc1a815e8, status=Running
2023-10-04 21:11:23,864|caper.cromwell_workflow_monitor|INFO| Task: id=91e49ae4-9226-4824-af45-301fc1a815e8, task=wgbs.make_conf:-1, retry=0, status=CallCached
2023-10-04 21:11:26,814|caper.cromwell_workflow_monitor|INFO| Task: id=91e49ae4-9226-4824-af45-301fc1a815e8, task=wgbs.make_metadata_csv:-1, retry=0, status=CallCached
2023-10-04 21:11:29,834|caper.cromwell_workflow_monitor|INFO| Task: id=91e49ae4-9226-4824-af45-301fc1a815e8, task=wgbs.index_reference:-1, retry=0, status=CallCached
2023-10-04 21:11:35,809|caper.cromwell_workflow_monitor|INFO| Task: id=91e49ae4-9226-4824-af45-301fc1a815e8, task=wgbs.prepare:-1, retry=0, status=CallCached
2023-10-04 21:11:44,809|caper.cromwell_workflow_monitor|INFO| Task: id=91e49ae4-9226-4824-af45-301fc1a815e8, task=wgbs.map:0, retry=0, status=Started, job_id=2081286
2023-10-04 21:11:44,837|caper.cromwell_workflow_monitor|INFO| Task: id=91e49ae4-9226-4824-af45-301fc1a815e8, task=wgbs.map:0, retry=0, status=Running
2023-10-04 21:11:51,943|caper.cromwell_workflow_monitor|INFO| Task: id=91e49ae4-9226-4824-af45-301fc1a815e8, task=wgbs.map:0, retry=0, status=Done
2023-10-04 21:11:59,788|caper.cromwell_workflow_monitor|INFO| Task: id=91e49ae4-9226-4824-af45-301fc1a815e8, task=wgbs.map:0, retry=1, status=Started, job_id=2081287
2023-10-04 21:11:59,796|caper.cromwell_workflow_monitor|INFO| Task: id=91e49ae4-9226-4824-af45-301fc1a815e8, task=wgbs.map:0, retry=1, status=Running
2023-10-04 21:12:04,068|caper.cromwell_workflow_monitor|INFO| Task: id=91e49ae4-9226-4824-af45-301fc1a815e8, task=wgbs.map:0, retry=1, status=Done
2023-10-04 21:12:05,042|caper.cromwell_workflow_monitor|INFO| Workflow: id=91e49ae4-9226-4824-af45-301fc1a815e8, status=Failed
2023-10-04 21:12:15,586|caper.cromwell_metadata|INFO| Wrote metadata file. /redacted/wgbs_test/wgbs/91e49ae4-9226-4824-af45-301fc1a815e8/metadata.json
2023-10-04 21:12:15,587|caper.cromwell|INFO| Workflow failed. Auto-troubleshooting...
2023-10-04 21:12:15,589|caper.nb_subproc_thread|ERROR| Cromwell failed. returncode=1
2023-10-04 21:12:15,589|caper.cli|ERROR| Check stdout in /redacted/wgbs_test/cromwell.out.1
* Started troubleshooting workflow: id=91e49ae4-9226-4824-af45-301fc1a815e8, status=Failed
* Found failures JSON object.                                                      
[                                                                                  
    {                                                                              
        "causedBy": [                                                              
            {                                                                      
                "message": "Job wgbs.map:0:2 exited with return code 1 which has not been declared as a valid return code. See 'continueOnReturnCode' runtime attribute for more details.",
                "causedBy": []                                                     
            }                                                                      
        ],                                                                         
        "message": "Workflow failed"                                               
    }                                                                              
]                                                                                  
* Recursively finding failures in calls (tasks)...                                 
                                                                                   
==== NAME=wgbs.map, STATUS=RetryableFailure, PARENT=                               
SHARD_IDX=0, RC=1, JOB_ID=2081286                                                  
START=2023-10-04T21:11:41.051Z, END=2023-10-04T21:11:54.791Z                       
STDOUT=/redacted/wgbs_test/wgbs/91e49ae4-9226-4824-af45-301fc1a815e8/call-map/shard-0/execution/stdout
STDERR=/redacted/wgbs_test/wgbs/91e49ae4-9226-4824-af45-301fc1a815e8/call-map/shard-0/execution/stderr
STDERR_CONTENTS=                                                                   
tar: /redacted/wgbs_test/wgbs/91e49ae4-9226-4824-af45-301fc1a815e8/call-map/shard-0/inputs/667055062/indexes.tar.gz: Cannot open: No such file or directory
tar: Error is not recoverable: exiting now                                         
ln: failed to access '/redacted/wgbs_test/wgbs/91e49ae4-9226-4824-af45-301fc1a815e8/call-map/shard-0/execution/mapping/**/*.bam': No such file or directory
ln: failed to access '/redacted/wgbs_test/wgbs/91e49ae4-9226-4824-af45-301fc1a815e8/call-map/shard-0/execution/mapping/**/*.csi': No such file or directory
ln: failed to access '/redacted/wgbs_test/wgbs/91e49ae4-9226-4824-af45-301fc1a815e8/call-map/shard-0/execution/mapping/**/*.bam.md5': No such file or directory
ln: failed to access '/redacted/wgbs_test/wgbs/91e49ae4-9226-4824-af45-301fc1a815e8/call-map/shard-0/execution/mapping/**/*.json': No such file or directory
                                                                                   
                                                                                   
==== NAME=wgbs.map, STATUS=Failed, PARENT=                                         
SHARD_IDX=0, RC=1, JOB_ID=2081287                                                  
START=2023-10-04T21:11:55.035Z, END=2023-10-04T21:12:04.072Z                       
STDOUT=/redacted/wgbs_test/wgbs/91e49ae4-9226-4824-af45-301fc1a815e8/call-map/shard-0/attempt-2/execution/stdout
STDERR=/redacted/wgbs_test/wgbs/91e49ae4-9226-4824-af45-301fc1a815e8/call-map/shard-0/attempt-2/execution/stderr
STDERR_CONTENTS=                                                                   
tar: /redacted/wgbs_test/wgbs/91e49ae4-9226-4824-af45-301fc1a815e8/call-map/shard-0/attempt-2/inputs/667055062/indexes.tar.gz: Cannot open: No such file or directory
tar: Error is not recoverable: exiting now                                         
ln: failed to access '/redacted/wgbs_test/wgbs/91e49ae4-9226-4824-af45-301fc1a815e8/call-map/shard-0/attempt-2/execution/mapping/**/*.bam': No such file or directory
ln: failed to access '/redacted/wgbs_test/wgbs/91e49ae4-9226-4824-af45-301fc1a815e8/call-map/shard-0/attempt-2/execution/mapping/**/*.csi': No such file or directory
ln: failed to access '/redacted/wgbs_test/wgbs/91e49ae4-9226-4824-af45-301fc1a815e8/call-map/shard-0/attempt-2/execution/mapping/**/*.bam.md5': No such file or directory
ln: failed to access '/redacted/wgbs_test/wgbs/91e49ae4-9226-4824-af45-301fc1a815e8/call-map/shard-0/attempt-2/execution/mapping/**/*.json': No such file or directory
                                                                                   

[question] difference between version 1 and version 2 WDL?

I see two wdl pipeline files:

$ ls wgbs-pipeline*
wgbs-pipeline-v2.wdl  wgbs-pipeline.wdl

along with test.wdl (which I believe is the one I should use) and some others:

$ ls *.wdl
bs-call.wdl  mapping.wdl  test.wdl  wgbs-pipeline-v2.wdl  wgbs-pipeline.wdl

How do I know what these are, and which to use? I think it would be intuitive if there were two clearly "main" pipelines in the folder: one named after the repo (wgbs-pipeline.wdl) and one for testing (test.wdl). Can we discuss what these other files are and how they are used?
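
In the meantime, one generic way to inspect what each WDL actually expects (standard Womtool usage, not specific to this repo) is to ask Womtool for its inputs:

$ java -jar womtool.jar inputs wgbs-pipeline.wdl
$ java -jar womtool.jar inputs wgbs-pipeline-v2.wdl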

Cannot find '__main__' module in '' when running installation test

Describe the bug
I was trying to install the ENCODE WGBS pipeline to analyze some Methyl-Seq data. Everything seemed fine until I tested the installation with the command from the README:
caper run wgbs-pipeline.wdl -i tests/functional/json/test_wgbs.json

The pipeline runs and assigns a work ID but fails within the first or second step, giving the following error message:
/home/ubuntu/anaconda3/bin/python3: can't find '__main__' module in ''

Any insight on this would be greatly appreciated.
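
One thing worth checking (a guess based on the README, not a confirmed diagnosis): the command above omits the --docker flag used in the README's test invocation, and without a container the task scripts bundled in the pipeline image may not be found on the host:

$ caper run wgbs-pipeline.wdl -i tests/functional/json/test_wgbs.json --docker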

OS/Platform

  • OS/Platform: Ubuntu 16.04
  • Conda version: v4.9.2
  • Pipeline version: cloned from GitHub on 02/15/2021
  • Caper version: v1.4.2

Caper configuration file

backend=local

# Hashing strategy for call-caching (3 choices)
# This parameter is for local (local/slurm/sge/pbs) backend only.
# This is important for call-caching,
# which means re-using outputs from previous/failed workflows.
# Cache will miss if different strategy is used.
# "file" method has been default for all old versions of Caper<1.0.
# "path+modtime" is a new default for Caper>=1.0,
#   file: use md5sum hash (slow).
#   path: use path.
#   path+modtime: use path and modification time.
local-hash-strat=path+modtime

# Local directory for localized files and Cromwell's intermediate files
# If not defined, Caper will make .caper_tmp/ on local-out-dir or CWD.
# /tmp is not recommended here since Caper store all localized data files
# on this directory (e.g. input FASTQs defined as URLs in input JSON).
local-loc-dir=

cromwell=/home/ubuntu/.caper/cromwell_jar/cromwell-52.jar
womtool=/home/ubuntu/.caper/womtool_jar/womtool-52.jar

Input JSON file
Contents of the test json: test_wgbs.json

{
  "wgbs.benchmark_mode": true,
  "wgbs.extra_reference": "tests/data/conversion_control.fa.gz",
  "wgbs.fastqs": [
    [
      [
        "tests/data/sample5_data_1_200000.fastq.gz",
        "tests/data/sample5_data_2_200000.fastq.gz"
      ]
    ]
  ],
  "wgbs.reference": "tests/data/sacCer3.fa.gz",
  "wgbs.sample_names": [
    "sample5"
  ],
  "wgbs.underconversion_sequence_name": "NC_001416.1"
}

Error log
Caper automatically runs a troubleshooter for failed workflows. If it doesn't, get the WORKFLOW_ID of your failed workflow with caper list, or directly use a metadata.json file from Caper's output directory.

$ caper debug [WORKFLOW_ID_OR_METADATA_JSON_FILE]

Output of caper debug metadata.json on the failed run

* Started troubleshooting workflow: id=6ce87d1d-27ca-4f38-8fbd-bec0ac88cc41, status=Failed
* Found failures JSON object.
[
    {
        "message": "Workflow failed",
        "causedBy": [
            {
                "causedBy": [],
                "message": "Job wgbs.make_conf:NA:2 exited with return code 1 which has not been declared as a valid return code. See 'continueOnReturnCode' runtime attribute for more details."
            },
            {
                "causedBy": [],
                "message": "Job wgbs.make_metadata_csv:NA:2 exited with return code 1 which has not been declared as a valid return code. See 'continueOnReturnCode' runtime attribute for more details."
            }
        ]
    }
]
* Recursively finding failures in calls (tasks)...

==== NAME=wgbs.make_conf, STATUS=RetryableFailure, PARENT=
SHARD_IDX=-1, RC=1, JOB_ID=45272
START=2021-02-15T20:57:45.333Z, END=2021-02-15T20:57:58.862Z
STDOUT=/home/ubuntu/20210215_IPI_MethylSeq_TestRun/software/wgbs-pipeline/wgbs/6ce87d1d-27ca-4f38-8fbd-bec0ac88cc41/call-make_conf/execution/stdout
STDERR=/home/ubuntu/20210215_IPI_MethylSeq_TestRun/software/wgbs-pipeline/wgbs/6ce87d1d-27ca-4f38-8fbd-bec0ac88cc41/call-make_conf/execution/stderr
STDERR_CONTENTS=
/home/ubuntu/anaconda3/bin/python3: can't find '__main__' module in ''


==== NAME=wgbs.make_conf, STATUS=Failed, PARENT=
SHARD_IDX=-1, RC=1, JOB_ID=45345
START=2021-02-15T20:58:01.217Z, END=2021-02-15T20:58:14.905Z
STDOUT=/home/ubuntu/20210215_IPI_MethylSeq_TestRun/software/wgbs-pipeline/wgbs/6ce87d1d-27ca-4f38-8fbd-bec0ac88cc41/call-make_conf/attempt-2/execution/stdout
STDERR=/home/ubuntu/20210215_IPI_MethylSeq_TestRun/software/wgbs-pipeline/wgbs/6ce87d1d-27ca-4f38-8fbd-bec0ac88cc41/call-make_conf/attempt-2/execution/stderr
STDERR_CONTENTS=
/home/ubuntu/anaconda3/bin/python3: can't find '__main__' module in ''


==== NAME=wgbs.make_metadata_csv, STATUS=RetryableFailure, PARENT=
SHARD_IDX=-1, RC=1, JOB_ID=45292
START=2021-02-15T20:57:47.212Z, END=2021-02-15T20:57:58.862Z
STDOUT=/home/ubuntu/20210215_IPI_MethylSeq_TestRun/software/wgbs-pipeline/wgbs/6ce87d1d-27ca-4f38-8fbd-bec0ac88cc41/call-make_metadata_csv/execution/stdout
STDERR=/home/ubuntu/20210215_IPI_MethylSeq_TestRun/software/wgbs-pipeline/wgbs/6ce87d1d-27ca-4f38-8fbd-bec0ac88cc41/call-make_metadata_csv/execution/stderr
STDERR_CONTENTS=
/home/ubuntu/anaconda3/bin/python3: can't find '__main__' module in ''


==== NAME=wgbs.make_metadata_csv, STATUS=Failed, PARENT=
SHARD_IDX=-1, RC=1, JOB_ID=45364
START=2021-02-15T20:58:03.205Z, END=2021-02-15T20:58:17.405Z
STDOUT=/home/ubuntu/20210215_IPI_MethylSeq_TestRun/software/wgbs-pipeline/wgbs/6ce87d1d-27ca-4f38-8fbd-bec0ac88cc41/call-make_metadata_csv/attempt-2/execution/stdout
STDERR=/home/ubuntu/20210215_IPI_MethylSeq_TestRun/software/wgbs-pipeline/wgbs/6ce87d1d-27ca-4f38-8fbd-bec0ac88cc41/call-make_metadata_csv/attempt-2/execution/stderr
STDERR_CONTENTS=
/home/ubuntu/anaconda3/bin/python3: can't find '__main__' module in ''

TypeError in wgbs.map Step

Describe the bug
At the wgbs.map step, I get a TypeError:

==== NAME=wgbs.map, STATUS=Failed, PARENT=
SHARD_IDX=1, RC=1, JOB_ID=9608
START=2021-10-15T03:04:21.289Z, END=2021-10-15T03:09:23.663Z
STDOUT=/resource3/data/WGBS/Processed_Caper/wgbs/9f04e4dd-2a84-4af6-8731-d3c49f6e2782/call-map/shard-1/attempt-2/execution/stdout
STDERR=/resource3/data/WGBS/Processed_Caper/wgbs/9f04e4dd-2a84-4af6-8731-d3c49f6e2782/call-map/shard-1/attempt-2/execution/stderr
STDERR_CONTENTS=
:
: Command map started at 2021-10-14 20:08:16.349606
:
: ------------ Mapping Parameters ------------
: Sample barcode : sample_1
: Data set : 1
: No. threads : 8
: Index : indexes/hg38.BS.gem
: Paired : False
: Read non stranded: False
: Type : SINGLE
: Input Files : ./fastq/1/Control_S1_L004_R2_001.fastq.gz
: Output dir : ./mapping/sample_1
:
: Bisulfite Mapping...
TypeError: sequence item 14: expected str instance, NoneType found
ln: failed to access '/resource3/data/WGBS/Processed_Caper/wgbs/9f04e4dd-2a84-4af6-8731-d3c49f6e2782/call-map/shard-1/attempt-2/execution/mapping/**/*.bam': No such file or directory
ln: failed to access '/resource3/data/WGBS/Processed_Caper/wgbs/9f04e4dd-2a84-4af6-8731-d3c49f6e2782/call-map/shard-1/attempt-2/execution/mapping/**/*.csi': No such file or directory
ln: failed to access '/resource3/data/WGBS/Processed_Caper/wgbs/9f04e4dd-2a84-4af6-8731-d3c49f6e2782/call-map/shard-1/attempt-2/execution/mapping/**/*.bam.md5': No such file or directory
ln: failed to access '/resource3/data/WGBS/Processed_Caper/wgbs/9f04e4dd-2a84-4af6-8731-d3c49f6e2782/call-map/shard-1/attempt-2/execution/mapping/**/*.json': No such file or directory

How can I resolve this error?
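
One detail that stands out in the log above (a guess, not a confirmed diagnosis): the run was treated as single-ended ('Paired : False', 'Type : SINGLE') with only an R2 file listed. The test JSONs elsewhere on this page place both reads of a pair in the innermost list of wgbs.fastqs, e.g. (the R1 filename here is hypothetical):

"wgbs.fastqs": [
  [
    [
      "./fastq/Control_S1_L004_R1_001.fastq.gz",
      "./fastq/Control_S1_L004_R2_001.fastq.gz"
    ]
  ]
]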

OS/Platform

  • OS/Platform: CentOS Linux 7 on a Slurm-managed HPC
  • Conda version: conda 4.10.3
  • Pipeline version: 1.1.7
  • Caper version: 1.6.3

Caper configuration file
default.conf.txt

Error log
Caper automatically runs a troubleshooter for failed workflows. If it doesn't, get the WORKFLOW_ID of your failed workflow with caper list, or directly use a metadata.json file from Caper's output directory.

$ caper debug [WORKFLOW_ID_OR_METADATA_JSON_FILE]

cromwell.out.txt

Input JSON File
json_input.txt

Software used by the workflow

Thanks for your contribution. I want to follow your process and implement it step by step, but I can't find detailed documentation of the methods, software, and workflow used, such as samtools, Bismark, or Picard?

Make sure to check that your docker daemon is running before trying to run the pipeline or it will fail

Describe the bug
Hello, I'm trying to bring the ENCODE pipelines to my school, and I'm working my way through learning how to use them. I'm getting a better handle on what to do with .wdl and input JSON files.
I've been playing around with cloning the wgbs-pipeline and running the sample code as described in the README, which was working last week. However, this week I started getting an error message when I tried running the pipeline from scratch.

Here are the commands I ran:
$ git clone https://github.com/ENCODE-DCC/wgbs-pipeline.git
$ pip3 install caper
$ caper run wgbs-pipeline.wdl -i tests/functional/json/test_wgbs.json --docker

The caper failure id is 0b655180-c147-4524-bed5-92d854487050
I tried running the debug on that error code but had an issue connecting to the caper server.

OS/Platform

  • OS/Platform: Ubuntu 20.04.2 LTS,
  • Conda version: 4.10.3
  • Pipeline version: v1.1.7
  • Caper version: 2.1.1

Caper configuration file

caper_config
Input JSON file

- name: test_wgbs
  tags:
    - functional
  command: >-
    tests/caper_run.sh
    wgbs-pipeline.wdl
    tests/functional/json/test_wgbs.json
  files:
    - path: test-output/gembs.conf
      md5sum: 1ad8f25544fa7dcb56383bc233407b54
    - path: test-output/gembs_metadata.csv
      md5sum: 9dd5d3bee6e37ae7dbbf4a29edd0ed3f
    - path: test-output/indexes.tar.gz
      md5sum: bde2c7f6984c318423cb7354c4c37d44
    - path: test-output/gemBS.json
      md5sum: d6ef6f4d2ee7e4c3d2e8c439bb2cb618
    - path: test-output/glob-e97d885c83d966247d485dc62b6ae799/sacCer3.contig.sizes
      md5sum: 0497066e3880c6932cf6bde815c42c40
    - path: test-output/glob-c8599c0b9048b55a8d5cfaad52995a94/sample_sample5.bam
      md5sum: fc1d87ed4f9f7dab78f58147c02d06c9
    - path: test-output/glob-e42c489c9c1355a3e5eca0071600f795/sample_sample5.bam.csi
      md5sum: b37be1c10623f32bbe73c364325754b0
    - path: test-output/glob-13824b1e03fdcf315fe2424593870e56/sample_sample5.bam.md5
      md5sum: 56f31539029eab274ff0ac03e84e214a
    - path: test-output/glob-c0e92e4e9fb050e7e70bb645748b45dd/0.json
      md5sum: cbf4ba8d84384779c626b298a9a60b96
    - path: test-output/glob-1e6c456aecc092f75370b54a5806588f/sample_sample5.bcf
      md5sum: 7cafb436b89898e852f971f1f3b20fc6
    - path: test-output/glob-804650e4b0c9cc57f1bbc0b3919d1f73/sample_sample5.bcf.csi
      md5sum: 156c39bd2bcc0bc83052eb4171f83507
    - path: test-output/glob-95d24e89d025dc63935acc9ded9f8810/sample_sample5.bcf.md5
      md5sum: 8ebe942fe48b07e1a3455f572aadf57b
    - path: test-output/glob-0b0236659b9524643e6454061959b28c/sample_sample5_pos.bw
      md5sum: c30bc10a258ca4f1fe67f115c4c2db10
    - path: test-output/glob-041e1709c7dd1f320426281eb4649f9b/sample_sample5_neg.bw
      md5sum: f49ab06a51d9c4a8e663f0472e70eb06
    - path: test-output/glob-708835e6a0042d33b00b6937266734f5/sample_sample5_chg.bb
      md5sum: cc123bff807e0637864d387628d410fa
    - path: test-output/glob-f70a6609728d4fb1448dba1f41361a30/sample_sample5_cpg.bb
      md5sum: f67f273c68577197ecf19a8bb92c925c
    - path: test-output/glob-52f916d7cc14a5bcfb168d6910e04b56/sample_sample5_chg.bed.gz
      md5sum: 6994fafbf9eab44ff6e7fafa421fffbc
    - path: test-output/glob-2b5148d6967b43eee33eb370fd36b70e/sample_sample5_chh.bed.gz
      md5sum: 31a69396b7084520c04bd80f2cabfd59
    - path: test-output/glob-f31cca1fcab505e10c2fe5ff003b211a/sample_sample5_cpg.bed.gz
      md5sum: 67193e21cc34b76d12efba1a19df6644
    - path: test-output/glob-b76cddd256e1197e0b726acc7184afc4/sample_sample5_cpg.txt.gz
      md5sum: fccd9c9c5b4fea80890abd536fd76a35
    - path: test-output/glob-24ceb385eea2ca53f1e6c4a1438ccd21/sample_sample5_cpg.txt.gz.tbi
      md5sum: a1e08686f568af353e9026c1de00c25d
    - path: test-output/glob-40c90aa4516b00209d682b819b1d021f/sample_sample5_non_cpg.txt.gz
      md5sum: 6809ee8479439454aa502ae11f48d91c
    - path: test-output/glob-664ff83c3881df2363da923f006b098b/sample_sample5_non_cpg.txt.gz.tbi
      md5sum: 6cdebb4ad2ea184ca4783acb350ae038
    - path: test-output/gembs_map_qc.json
      md5sum: 26b5238ab7bb5b195d1cf8127767261c
    - path: test-output/glob-65c481a690a62b639d918bb70927f25e/sample_sample5.isize.png
      md5sum: f0277a185298dee7156ec927b02466c7
    - path: test-output/mapping_reports/mapping/sample_sample5.mapq.png
      md5sum: 49837be15f24f23c59c50241cf504614
    - path: test-output/glob-1aeed469ae5d1e8d7cbca51e8758b781/ENCODE.html
      md5sum: 163401e0bf6a377c2a35dc4bf9064574
    - path: test-output/glob-1aeed469ae5d1e8d7cbca51e8758b781/0.html
      md5sum: 604b6f1a4c641b7d308a4766d97cadb7
    - path: test-output/glob-1aeed469ae5d1e8d7cbca51e8758b781/sample_sample5.html
      md5sum: b86ce9e15ab8e1e9c45f467120c22649
    - path: test-output/glob-1aeed469ae5d1e8d7cbca51e8758b781/0.isize.png
      md5sum: 82a262d0bf2dadb02239272da490bba1
    - path: test-output/glob-1aeed469ae5d1e8d7cbca51e8758b781/0.mapq.png
      md5sum: d8c3af2eae5af12f1eb5dd9ec4e225bb
    - path: test-output/glob-1aeed469ae5d1e8d7cbca51e8758b781/style.css
      md5sum: a09ae01f70fa6d2461e37d5814ceb579
    - path: test-output/coverage.bw
      md5sum: afa224c2037829dccacea4a67b6fa84a
    - path: test-output/average_coverage_qc.json
      md5sum: 82ce31e21d361d52a7f19dce1988b827
    - path: test-output/bed_pearson_correlation_qc.json
      should_exist: false

Error log
caper_error_0b655180-c147-4524-bed5-92d854487050_terminal

$ caper debug [WORKFLOW_ID_OR_METADATA_JSON_FILE]

caper_error_0b655180-c147-4524-bed5-92d854487050

Thank you so much!
Best,
Jake Lehle

Failed with test data

OS/Platform

  • OS/Platform: Ubuntu 20.04.3 LTS
  • Conda version: conda 4.11.0
  • Pipeline version: v1.1.8
  • Caper version: v2.1.3

Caper configuration file

backend=local

# Hashing strategy for call-caching (3 choices)
# This parameter is for local (local/slurm/sge/pbs/lsf) backend only.
# This is important for call-caching,
# which means re-using outputs from previous/failed workflows.
# Cache will miss if different strategy is used.
# "file" method has been default for all old versions of Caper<1.0.
# "path+modtime" is a new default for Caper>=1.0,
#   file: use md5sum hash (slow).
#   path: use path.
#   path+modtime: use path and modification time.
local-hash-strat=path+modtime

# Metadata DB for call-caching (reusing previous outputs):
# Cromwell supports restarting workflows based on a metadata DB
# DB is in-memory by default
#db=in-memory

# If you use 'caper server' then you can use one unified '--file-db'
# for all submitted workflows. In such case, uncomment the following two lines
# and defined file-db as an absolute path to store metadata of all workflows
#db=file
#file-db=

# If you use 'caper run' and want to use call-caching:
# Make sure to define different 'caper run ... --db file --file-db DB_PATH'
# for each pipeline run.
# But if you want to restart then define the same '--db file --file-db DB_PATH'
# then Caper will collect/re-use previous outputs without running the same task again
# Previous outputs will be simply hard/soft-linked.


# Local directory for localized files and Cromwell's intermediate files
# If not defined, Caper will make .caper_tmp/ on local-out-dir or CWD.
# /tmp is not recommended here since Caper store all localized data files
# on this directory (e.g. input FASTQs defined as URLs in input JSON).
local-loc-dir=/mnt/storage/hong/caper

cromwell=/home/hong/.caper/cromwell_jar/cromwell-65.jar
womtool=/home/hong/.caper/womtool_jar/womtool-65.jar

Input JSON file

{
  "wgbs.benchmark_mode": true,
  "wgbs.extra_reference": "tests/data/conversion_control.fa.gz",
  "wgbs.fastqs": [
    [
      [
        "tests/data/sample5_data_1_200000.fastq.gz",
        "tests/data/sample5_data_2_200000.fastq.gz"
      ]
    ]
  ],
  "wgbs.reference": "tests/data/sacCer3.fa.gz",
  "wgbs.sample_names": [
    "sample5"
  ],
  "wgbs.underconversion_sequence_name": "NC_001416.1"
}

Error log
Caper automatically runs a troubleshooter for failed workflows. If it doesn't, get the WORKFLOW_ID of your failed workflow with caper list, or directly use a metadata.json file from Caper's output directory.

* Found failures JSON object.
[
    {
        "message": "Workflow failed",
        "causedBy": [
            {
                "message": "Job wgbs.make_conf:NA:2 exited with return code 1 which has not been declared as a valid return code. See 'continueOnReturnCode' runtime attribute for more details.",
                "causedBy": []
            },
            {
                "message": "Job wgbs.make_metadata_csv:NA:2 exited with return code 1 which has not been declared as a valid return code. See 'continueOnReturnCode' runtime attribute for more details.",
                "causedBy": []
            }
        ]
    }
]
* Recursively finding failures in calls (tasks)...

==== NAME=wgbs.make_conf, STATUS=RetryableFailure, PARENT=
SHARD_IDX=-1, RC=1, JOB_ID=588529
START=2022-02-24T14:20:39.912Z, END=2022-02-24T14:20:52.057Z
STDOUT=/home/hong/wgbs-pipeline/wgbs/9cd9270a-b1e7-45cb-a706-260cd3685f1c/call-make_conf/execution/stdout
STDERR=/home/hong/wgbs-pipeline/wgbs/9cd9270a-b1e7-45cb-a706-260cd3685f1c/call-make_conf/execution/stderr
STDERR_CONTENTS=
/home/hong/anaconda3/envs/wgbs/bin/python3: can't find '__main__' module in '/home/hong/wgbs-pipeline/wgbs/9cd9270a-b1e7-45cb-a706-260cd3685f1c/call-make_conf/execution/'

STDERR_BACKGROUND_CONTENTS=
/home/hong/anaconda3/envs/wgbs/bin/python3: can't find '__main__' module in '/home/hong/wgbs-pipeline/wgbs/9cd9270a-b1e7-45cb-a706-260cd3685f1c/call-make_conf/execution/'



==== NAME=wgbs.make_conf, STATUS=Failed, PARENT=
SHARD_IDX=-1, RC=1, JOB_ID=592187
START=2022-02-24T14:20:54.103Z, END=2022-02-24T14:21:01.920Z
STDOUT=/home/hong/wgbs-pipeline/wgbs/9cd9270a-b1e7-45cb-a706-260cd3685f1c/call-make_conf/attempt-2/execution/stdout
STDERR=/home/hong/wgbs-pipeline/wgbs/9cd9270a-b1e7-45cb-a706-260cd3685f1c/call-make_conf/attempt-2/execution/stderr
STDERR_CONTENTS=
/home/hong/anaconda3/envs/wgbs/bin/python3: can't find '__main__' module in '/home/hong/wgbs-pipeline/wgbs/9cd9270a-b1e7-45cb-a706-260cd3685f1c/call-make_conf/attempt-2/execution/'

STDERR_BACKGROUND_CONTENTS=
/home/hong/anaconda3/envs/wgbs/bin/python3: can't find '__main__' module in '/home/hong/wgbs-pipeline/wgbs/9cd9270a-b1e7-45cb-a706-260cd3685f1c/call-make_conf/attempt-2/execution/'



==== NAME=wgbs.make_metadata_csv, STATUS=RetryableFailure, PARENT=
SHARD_IDX=-1, RC=1, JOB_ID=588545
START=2022-02-24T14:20:40.106Z, END=2022-02-24T14:20:52.057Z
STDOUT=/home/hong/wgbs-pipeline/wgbs/9cd9270a-b1e7-45cb-a706-260cd3685f1c/call-make_metadata_csv/execution/stdout
STDERR=/home/hong/wgbs-pipeline/wgbs/9cd9270a-b1e7-45cb-a706-260cd3685f1c/call-make_metadata_csv/execution/stderr
STDERR_CONTENTS=
/home/hong/anaconda3/envs/wgbs/bin/python3: can't find '__main__' module in '/home/hong/wgbs-pipeline/wgbs/9cd9270a-b1e7-45cb-a706-260cd3685f1c/call-make_metadata_csv/execution/'

STDERR_BACKGROUND_CONTENTS=
/home/hong/anaconda3/envs/wgbs/bin/python3: can't find '__main__' module in '/home/hong/wgbs-pipeline/wgbs/9cd9270a-b1e7-45cb-a706-260cd3685f1c/call-make_metadata_csv/execution/'



==== NAME=wgbs.make_metadata_csv, STATUS=Failed, PARENT=
SHARD_IDX=-1, RC=1, JOB_ID=593365
START=2022-02-24T14:20:56.102Z, END=2022-02-24T14:21:03.969Z
STDOUT=/home/hong/wgbs-pipeline/wgbs/9cd9270a-b1e7-45cb-a706-260cd3685f1c/call-make_metadata_csv/attempt-2/execution/stdout
STDERR=/home/hong/wgbs-pipeline/wgbs/9cd9270a-b1e7-45cb-a706-260cd3685f1c/call-make_metadata_csv/attempt-2/execution/stderr
STDERR_CONTENTS=
/home/hong/anaconda3/envs/wgbs/bin/python3: can't find '__main__' module in '/home/hong/wgbs-pipeline/wgbs/9cd9270a-b1e7-45cb-a706-260cd3685f1c/call-make_metadata_csv/attempt-2/execution/'

STDERR_BACKGROUND_CONTENTS=
/home/hong/anaconda3/envs/wgbs/bin/python3: can't find '__main__' module in '/home/hong/wgbs-pipeline/wgbs/9cd9270a-b1e7-45cb-a706-260cd3685f1c/call-make_metadata_csv/attempt-2/execution/'


2022-02-24 15:21:17,695|caper.nb_subproc_thread|ERROR| Cromwell failed. returncode=1
2022-02-24 15:21:17,695|caper.cli|ERROR| Check stdout in /home/hong/wgbs-pipeline/cromwell.out.3
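
The stderr here has the same signature as the earlier "can't find '__main__' module" report, and the configuration shows a plain local backend with a conda Python. A guess, not a confirmed diagnosis: run the test through a container so the tasks use the pipeline image, e.g.

$ caper run wgbs-pipeline.wdl -i tests/functional/json/test_wgbs.json --docker

or, where Docker is unavailable, Caper's --singularity flag (the log in the latency issue above shows this pipeline being run with a Singularity image).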

Update to gemBS v3

Hello,

As far as I can tell you created this pipeline based on gemBS v2. Are you planning to update it for gemBS v3?

Best,
Bekir
