Comments (18)
Prepping a PR now..
from methylseq.
Ok, getting somewhere.. @IsmailM - this should definitely work. You're using Bismark with a standard iGenomes reference. GRCh37
has both a fasta and Bismark ref specified.
Looking at the code a little more closely, I think this is because these values are being set in nextflow.config
lines 16-19 and the iGenomes config is being loaded later in the same file, lines 100-103. So there are no iGenomes refs when the variable is set. @efratushava - your issue in #122 is due to the same problem I think.
In other pipelines, such as the RNAseq pipeline, those reference params are set in main.nf
instead for exactly this reason.
This used to be the case for the methylseq pipeline too, but it looks like they were moved by @phue in the massive template sync in #92 - so this change would have come out in version 1.4 of the pipeline and broken stuff. Which makes sense.
The fix should be to either use {}
to defer the execution of those config lines, or better still to just move them back to main.nf
. I think this is the best approach. It should be pretty easy and I think that this is a serious enough problem to warrant a patch release.
@phue - what do you think?
Phil
from methylseq.
I think that this is a serious enough problem to warrant a patch release.
Actually, given that we have merged a couple of PRs, I propose we do a v1.4.1 minor release.
from methylseq.
Please link to specific lines in the code (permalink with commit hash) if there are bits that you think are wrong.
lines 100-103 in main.nf, the lines that were added in the dev version
Have you pulled the latest version of the dev branch?
nextflow pull
yes, with git clone -b dev
from methylseq.
Hi @efratushava,
Good spot, I'm actually not sure.. I've never personally used the bwameth pipeline with Human data / iGenomes I think.
Ok, looking at https://github.com/brentp/bwa-meth#index - I think that the indices for bwameth are not the same as bwa. Running bwameth.py index
does call the regular bwa index
command, but it's done with a Fasta file that is converted to a 3-letter code.
So if we try to use regular bwa iGenomes indices for this analysis, it will fail. Or worse, work but give really terrible results.
Ideally, we should add bwa_meth indices to iGenomes for all species. But in the short term, it's possible that people will create their own config files with different species using this notation. So I'm happy to leave the code as it is I think.
Does this make sense? Let me know if you have any suggestions for code / documentation that could be improved.
Phil
from methylseq.
Maybe I was not clear.
The iGenome config file doesn't contin a bwa_meth field, so when running the pipeline with the bwameth aligner, it's looking tor the params.bwa_meth_index
which will always be false because params.genomes[ params.genome ].bwa_meth does not exist.
So, now, in the iGenome config, the bwa field contains a regular bwa index?
from methylseq.
Hi Phil,
I have run into the same issue (when running the bismark aligner option)
Perhaps, the config code could be updated to check for aligner type before attempting to set the bwa_meth_index
value (and look for the bwa_meth
key that does not exist by default)?
Moreover, there is no fasta_index
key in the igenomes config. Should we also set this to false.
from methylseq.
Hi both,
These config keys are not unique to iGenomes. We use the same configuration structure in our custom configs for home-made genome indices. This is why I think it's useful to have that code, even if these keys are not available in the iGenomes config.
I was under the impression that if your current pipeline config doesn't have a key called bwa_meth
then the channel will be set to false and the index will be built:
Line 17 in 40760ec
So if params.genomes[ params.genome ].bwa_meth
evaluates to false
(such as when it's not set), then bwa_meth_index
will be set to false
and the index will be built. Same for Bismark and fasta_index
.
Am I wrong about this - is the pipeline crashing for you?
So, now, in the iGenome config, the bwa field contains a regular bwa index?
Yes - used by other pipelines for regular genome alignments. A recent template change added all iGenomes index paths to all pipelines (previously these lines were absent from the pipeline).
Phil
from methylseq.
I initially saw the following error:
$ nextflow run nf-core/methylseq -profile docker -resume -c bismark.config --genome 'GRCh37'
N E X T F L O W ~ version 19.10.0
Launching `nf-core/methylseq` [grave_almeida] - revision: 40760ecffb [master]
No reference genome index or fasta file specified. Expression: (params.bismark_index || params.fasta)
-- Check script '/home/ucbtmog/.nextflow/assets/nf-core/methylseq/main.nf' at line: 109 or see '.nextflow.log' file for more details
Looking back at this, this seems to be less about the bwa_meth
/ fasta_index
keys not being present in the key, but rather bismark_index
and fasta
not being set automatically.
Setting the following manually in my bismark.config fixes the issue:
params {
...
bismark_index = "${params.igenomes_base}/Homo_sapiens/Ensembl/GRCh37/Sequence/BismarkIndex/"
fasta = "${params.igenomes_base}/Homo_sapiens/Ensembl/GRCh37/Sequence/WholeGenomeFasta/genome.fa"
...
}
from methylseq.
I also had this problem, issue #122
from methylseq.
still not working...
~% nextflow -log methylseq_try run nf-core/methylseq -with-singularity /tmp/ekushele-singularity/biscuit.img --aligner 'bismark' --reads 'data/fastq_files/*/*sorted_{1,2}.fastq' --genome 'GRCh37' --outdir 'data/nextflow'
N E X T F L O W ~ version 19.10.0
Launching `nf-core/methylseq` [shrivelled_golick] - revision: 40760ecffb [master]
Unknown config attribute `params.genomes.GRCh37.bwa_meth` -- check config file: /cs/icore/ekushele/.nextflow/assets/nf-core/methylseq/nextflow.config
What's up with this?
from methylseq.
It is not yet released, try with -r dev
from methylseq.
You for sure noticed, but I'm just emphasizing-the parameters in the main.nf file should start with params.fasta, params.fasta_index etc... (instead of fasta, fasta_index)
But somehow-it's still not working! The pipeline can't run on my files, which exist for sure!
The script made a link from the files I specified to the working directory, but the fastqc interupts everything, showing the error
Skipping '149122-cf-M.ben_list.b37.sorted_1.fastq' which didn't exist, or couldn't be read.
And the file exists!!!
from methylseq.
Please link to specific lines in the code (permalink with commit hash) if there are bits that you think are wrong.
Have you pulled the latest version of the dev branch? nextflow pull
from methylseq.
Though the file not existing sounds like it could be a similar problem to the MultiQC issue. I suspect that these are not pipeline specific but relate to containers / Nextflow interaction with the underlying compute infrastructure...
from methylseq.
Lines 100 to 103 in d0dffea
from methylseq.
This should now be fixed in dev
and will be released in 1.4.1 as a minor release.
Thanks all for spotting the error and helping with the fix 👍
from methylseq.
Hi all,
It took a long time sorry, but we finally just released v1.5 of the pipeline, which includes these fixes.
Apologies for the delay,
Phil
from methylseq.
Related Issues (20)
- methylseq is requesting very high compute resources I don't have. HOT 2
- test failure
- Update toolCitationText HOT 1
- Refactor kit processing to be maintainable
- nf-validation
- Pipeline unable to recognise samples processed across multiple lanes HOT 2
- bwameth index creation fails HOT 4
- Error when providing a bwameth index (MethylSeq v2.4.0 and v2.5.0)
- methylKit-compatible files are not copied to publishDir
- Parameters ignore_r1 for methylationextractor not expanded
- ERROR ~ fromIndex = -1 HOT 6
- Missing type field in nextflow_schema.json file
- Correct documentation to explain minimal inputs for bwa-meth subworkflow HOT 4
- Add bwameth test config HOT 3
- Scatter CI improvements
- --skip_qualimap parameter to disable qualimap
- Add CX report support from bismark_methylation_extractor HOT 7
- Add `asTair`
- Support for NuGEN Ovation RRBS HOT 2
- samplesheet input is not detecting entire values in sample column HOT 6
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from methylseq.