Rapid-CNS² workflow

Overview

The Rapid-CNS² Nextflow pipeline is a bioinformatics workflow for comprehensive analysis of genomic and epigenomic data generated by adaptive-sampling-based sequencing of central nervous system (CNS) tumours. It performs basecalling, variant calling, methylation analysis, structural variant calling and copy number variation calling, and produces a comprehensive, molecular-diagnostics-ready report.

This pipeline is implemented using Nextflow, allowing for easy execution and scalability on various compute environments, including local machines, clusters, and cloud platforms.

Features

Modular architecture for easy customization and extension.
Supports both basecalling from raw ONT POD5 files and analysis of pre-aligned BAM files.
Accelerated variant calling with Clara Parabricks-supported DeepVariant and Sniffles2.
Annotation and filtering of clinically relevant variants.
Methylation analysis with the Rapid-CNS² classifier and MGMT promoter methylation status determination.
Automated report generation summarizing analysis results.
Optional preparation of input files for the MNP-Flex classifier.

Requirements

Nextflow (version 3.0.0 or later)
Conda, Docker or Singularity (optional, for containerized execution of tools)
Required input data:
- Raw ONT POD5 data (for basecalling) or pre-aligned BAM files
- Reference genome file (hg19 required, matching the `--ref` parameter below)
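
A quick pre-flight check can confirm that these tools are available before launching. The sketch below is only an illustration: the function names `have` and `check_requirements` are hypothetical, and it tests only that the executables are on `PATH`, not their versions.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check; not part of the pipeline itself.

# Return success if the named command is on PATH.
have() {
    command -v "$1" >/dev/null 2>&1
}

# Print one status line per tool and return non-zero if Nextflow is missing.
check_requirements() {
    local status=0
    if have nextflow; then
        echo "nextflow: found"
    else
        echo "nextflow: MISSING (required)"
        status=1
    fi
    # Conda, Docker and Singularity are optional; report whichever are present.
    local tool
    for tool in conda docker singularity; do
        if have "$tool"; then
            echo "$tool: found"
        else
            echo "$tool: not found (optional)"
        fi
    done
    return "$status"
}
```

Calling `check_requirements` before `nextflow run` gives an early, readable failure if Nextflow itself is not installed.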

Usage

Clone this repository:

git clone https://github.com/areebapatel/Rapid-CNS2_nf.git

Edit the nextflow.config file to configure pipeline parameters according to your requirements.
Run the pipeline using Nextflow:
```
nextflow run main.nf --input <input_directory> --id <sample_identifier> [--options]
```
Replace <input_directory> with the path to the directory containing ONT POD5 data or pre-aligned BAM files, and <sample_identifier> with a unique identifier for the sample.

Additional options can be specified to customize pipeline behavior. Use the --help option to view available options and their descriptions.
Monitor pipeline progress and access results in the specified output directory.
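
The steps above can be sketched as a small wrapper that assembles the `nextflow run` command from variables. This is an illustration, not part of the repository: the function name `build_run_cmd`, the example paths and the sample ID are hypothetical; only `--input`, `--id`, `--ref`, `--out_dir` and `--basecalling` come from the parameter list in this README. The function prints the command rather than running it, so it can be inspected first.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a `nextflow run` command for one sample.
# Arguments: <input> <sample_id> <reference> [out_dir] [basecalling: true|false]
build_run_cmd() {
    local input="$1" id="$2" ref="$3"
    local out_dir="${4:-output}"        # README default for --out_dir
    local basecalling="${5:-false}"     # README default for --basecalling

    local cmd="nextflow run main.nf --input $input --id $id --ref $ref --out_dir $out_dir"
    # Add the basecalling flag only when starting from raw POD5 data.
    if [ "$basecalling" = "true" ]; then
        cmd="$cmd --basecalling"
    fi
    echo "$cmd"
}

# Example with made-up paths: BAM input, default output directory.
build_run_cmd /data/sample01/sample01.bam sample01 /refs/reference.fa
```

Paths containing spaces are not handled in this sketch; building the command as a bash array would be the more robust choice for real use.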

Sequencing

This pipeline analyses CNS tumour data generated by Nanopore adaptive sampling, using either ReadFish or the adaptive sampling built into MinKNOW. It is compatible with data generated on MinION, GridION and PromethION devices.

Parameters

| Parameter | Description | Default Value |
| --- | --- | --- |
| `--input` | Path to the directory containing POD5 files for Dorado basecalling and minimap2 alignment, or to a BAM file if available | (Required) |
| `--id` | Sample identifier | (Required) |
| `--ref` | Path to hg19 reference file | `null` |
| `--tmp_dir` | Directory to store temporary files. It is created if it does not exist | `tempDir` |
| `--out_dir` | Directory to store all outputs | `output` |
| `--log_dir` | Directory to store log files | `logDir` |
| `--minimum_mgmt_cov` | Minimum coverage for MGMT promoter methylation analysis | `5` |
| `--model_config` | Basecalling model to be used | `[email protected]` |
| `--port` | Port for basecall server | `8887` |
| `--num_gpu` | Number of GPUs to use | `3` |
| `--num_clients` | Number of clients | `params.num_gpu * 3` |
| `--help` | Show help message | `null` |
| `--test` | Run in test mode | `null` |
| `--reads` | Samtools addreplacerg -r option. Specify as `-r "SM:GM24385" -r "ID:GM24385"` | `null` |
| `--basecalling` | Enable basecalling from raw ONT POD5 data. If provided, `--input` should point to the directory containing raw data. | `false` |
| `--mnp-flex` | Prepare input file for the MNP-Flex classifier. | `false` |
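
As an illustration of how `--input` and `--basecalling` relate, the sketch below guesses the mode from the input path: a directory containing `.pod5` files suggests basecalling, and a `.bam` file suggests a pre-aligned run. It also shows the `--num_clients` default as three clients per GPU. The function names are hypothetical, and the pipeline itself may validate its input differently.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; the pipeline's own input validation may differ.

# Print "basecalling" for a directory holding at least one .pod5 file,
# "bam" for an existing .bam file, and "unknown" otherwise.
detect_input_mode() {
    local input="$1"
    if [ -d "$input" ] && ls "$input"/*.pod5 >/dev/null 2>&1; then
        echo "basecalling"
    elif [ -f "$input" ] && [ "${input##*.}" = "bam" ]; then
        echo "bam"
    else
        echo "unknown"
    fi
}

# Default number of basecall clients: three per GPU, as in the table above.
default_num_clients() {
    local num_gpu="${1:-3}"
    echo $(( num_gpu * 3 ))
}
```

A wrapper script could use the result of `detect_input_mode` to decide whether to append `--basecalling` to the `nextflow run` command.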

Acknowledgements

We are extremely grateful to all our lab members and collaborators for their support! Keeping up with AI to make our life easier and to compensate for our (Areeba's) art skills, our logo was generated by DALL-E.

Citation

If you use this pipeline, please cite our preprint:

Felix Sahm, Areeba Patel, Kirsten Göbel et al. Versatile, accessible cross-platform molecular profiling of central nervous system tumors: web-based, prospective multi-center validation, 10 April 2024, PREPRINT (Version 1) available at Research Square [https://doi.org/10.21203/rs.3.rs-4182910/v1]

Contributions

Contributions are welcome! If you encounter any issues, have suggestions for improvements, or would like to contribute new features, please open an issue or pull request on this repository.

License

This project is licensed under the MIT License.

Missing `.nf` file for report rendering and `mnpFlex.nf`

Dear @areebapatel,

While waiting for a response to my first issue, I tried deploying the pipeline on a local machine. I came across a lot of errors, mostly bugs where the names used to call the processes in main.nf do not match how they are actually named in the respective ./nextflow/*.nf files. Kindly look into it and make the necessary updates.
Additionally, there is a problem in basecalling.nf: the runtime: keyword in

runtime:
    docker pod5Docker
    cpus runtime_attributes.nThreads
    memory "${runtime_attributes.gbRAM} GB"
    time 10.hours

keeps being flagged as invalid, so I replaced the : with {} braces as shown:

runtime {
    docker pod5Docker
    cpus runtime_attributes.nThreads
    memory "${runtime_attributes.gbRAM} GB"
    time 10.hours
}

Subsequently, the whole reportRendering.nf file referenced in include { reportRendering } from './nextflow/reportRendering.nf' is missing from the ./nextflow directory. From there, I could not proceed any further. I request that the reportRendering.nf file be added so I can proceed.
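
A quick way to see which included module files are absent, assuming the `include { name } from './path.nf'` form quoted above, is to extract each path from `main.nf` and test whether it exists. This is only a sketch; the function name `list_missing_includes` is hypothetical, and directory-style modules are not handled.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list include targets of a Nextflow script that do not exist.
# Usage: list_missing_includes path/to/main.nf
list_missing_includes() {
    local main_nf="$1"
    local base_dir
    base_dir="$(dirname "$main_nf")"
    # Pull the quoted path out of every line of the form:
    #   include { x } from './some/file.nf'
    sed -n "s/^[[:space:]]*include[[:space:]].*from[[:space:]]*['\"]\([^'\"]*\)['\"].*/\1/p" "$main_nf" |
    while IFS= read -r rel_path; do
        # Nextflow allows the .nf extension to be omitted in include paths.
        case "$rel_path" in
            *.nf) ;;
            *) rel_path="$rel_path.nf" ;;
        esac
        # Include paths are resolved relative to the including script.
        if [ ! -f "$base_dir/$rel_path" ]; then
            echo "$rel_path"
        fi
    done
}
```

Running `list_missing_includes main.nf` from the repository root prints one missing path per line, and prints nothing when every included file is present.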

Thanks,
George
