Coder Social home page Coder Social logo

Comments (18)

ewels avatar ewels commented on July 18, 2024 4

Prepping a PR now..

from methylseq.

ewels avatar ewels commented on July 18, 2024 1

Ok, getting somewhere.. @IsmailM - this should definitely work. You're using Bismark with a standard iGenomes reference. GRCh37 has both a fasta and Bismark ref specified.

Looking at the code a little more closely, I think this is because these values are being set in nextflow.config lines 16-19 and the iGenomes config is being loaded later in the same file, lines 100-103. So there are no iGenomes refs when the variable is set. @efratushava - your issue in #122 is due to the same problem I think.

In other pipelines, such as the RNAseq pipeline, those reference params are set in main.nf instead for exactly this reason.

This used to be the case for the methylseq pipeline too, but it looks like they were moved by @phue in the massive template sync in #92 - so this change would have come out in version 1.4 of the pipeline and broken stuff. Which makes sense.

The fix should be to either use {} to defer the execution of those config lines, or better still to just move them back to main.nf. I think this is the best approach. It should be pretty easy and I think that this is a serious enough problem to warrant a patch release.

@phue - what do you think?

Phil

from methylseq.

ewels avatar ewels commented on July 18, 2024 1

I think that this is a serious enough problem to warrant a patch release.

Actually, given that we have merged a couple of PRs, I propose we do a v1.4.1 minor release.

from methylseq.

efratushava avatar efratushava commented on July 18, 2024 1

Please link to specific lines in the code (permalink with commit hash) if there are bits that you think are wrong.

lines 100-103 in main.nf, the lines that were added in the dev version

Have you pulled the latest version of the dev branch? nextflow pull

yes, with git clone -b dev

from methylseq.

ewels avatar ewels commented on July 18, 2024

Hi @efratushava,

Good spot, I'm actually not sure.. I've never personally used the bwameth pipeline with Human data / iGenomes I think.

Ok, looking at https://github.com/brentp/bwa-meth#index - I think that the indices for bwameth are not the same as bwa. Running bwameth.py index does call the regular bwa index command, but it's done with a Fasta file that is converted to a 3-letter code.

So if we try to use regular bwa iGenomes indices for this analysis, it will fail. Or worse, work but give really terrible results.

Ideally, we should add bwa_meth indices to iGenomes for all species. But in the short term, it's possible that people will create their own config files with different species using this notation. So I'm happy to leave the code as it is I think.

Does this make sense? Let me know if you have any suggestions for code / documentation that could be improved.

Phil

from methylseq.

efratushava avatar efratushava commented on July 18, 2024

Maybe I was not clear.
The iGenome config file doesn't contin a bwa_meth field, so when running the pipeline with the bwameth aligner, it's looking tor the params.bwa_meth_index which will always be false because params.genomes[ params.genome ].bwa_meth does not exist.

So, now, in the iGenome config, the bwa field contains a regular bwa index?

from methylseq.

IsmailM avatar IsmailM commented on July 18, 2024

Hi Phil,

I have run into the same issue (when running the bismark aligner option)

Perhaps, the config code could be updated to check for aligner type before attempting to set the bwa_meth_index value (and look for the bwa_meth key that does not exist by default)?

Moreover, there is no fasta_index key in the igenomes config. Should we also set this to false.

from methylseq.

ewels avatar ewels commented on July 18, 2024

Hi both,

These config keys are not unique to iGenomes. We use the same configuration structure in our custom configs for home-made genome indices. This is why I think it's useful to have that code, even if these keys are not available in the iGenomes config.

I was under the impression that if your current pipeline config doesn't have a key called bwa_meth then the channel will be set to false and the index will be built:

bwa_meth_index = params.genome ? params.genomes[ params.genome ].bwa_meth ?: false : false

So if params.genomes[ params.genome ].bwa_meth evaluates to false (such as when it's not set), then bwa_meth_index will be set to false and the index will be built. Same for Bismark and fasta_index.

Am I wrong about this - is the pipeline crashing for you?

So, now, in the iGenome config, the bwa field contains a regular bwa index?

Yes - used by other pipelines for regular genome alignments. A recent template change added all iGenomes index paths to all pipelines (previously these lines were absent from the pipeline).

Phil

from methylseq.

IsmailM avatar IsmailM commented on July 18, 2024

I initially saw the following error:

$ nextflow run nf-core/methylseq -profile docker -resume -c bismark.config --genome 'GRCh37' 
N E X T F L O W  ~  version 19.10.0
Launching `nf-core/methylseq` [grave_almeida] - revision: 40760ecffb [master]
No reference genome index or fasta file specified. Expression: (params.bismark_index || params.fasta)

-- Check script '/home/ucbtmog/.nextflow/assets/nf-core/methylseq/main.nf' at line: 109 or see '.nextflow.log' file for more details

Looking back at this, this seems to be less about the bwa_meth / fasta_index keys not being present in the key, but rather bismark_index and fasta not being set automatically.

Setting the following manually in my bismark.config fixes the issue:

params {
  ...
  bismark_index = "${params.igenomes_base}/Homo_sapiens/Ensembl/GRCh37/Sequence/BismarkIndex/"
  fasta = "${params.igenomes_base}/Homo_sapiens/Ensembl/GRCh37/Sequence/WholeGenomeFasta/genome.fa"
  ...
}

from methylseq.

efratushava avatar efratushava commented on July 18, 2024

I also had this problem, issue #122

from methylseq.

efratushava avatar efratushava commented on July 18, 2024

still not working...

~% nextflow -log methylseq_try run nf-core/methylseq -with-singularity /tmp/ekushele-singularity/biscuit.img --aligner 'bismark' --reads 'data/fastq_files/*/*sorted_{1,2}.fastq' --genome 'GRCh37' --outdir 'data/nextflow'  
N E X T F L O W  ~  version 19.10.0
Launching `nf-core/methylseq` [shrivelled_golick] - revision: 40760ecffb [master]
Unknown config attribute `params.genomes.GRCh37.bwa_meth` -- check config file: /cs/icore/ekushele/.nextflow/assets/nf-core/methylseq/nextflow.config

What's up with this?

from methylseq.

ewels avatar ewels commented on July 18, 2024

It is not yet released, try with -r dev

from methylseq.

efratushava avatar efratushava commented on July 18, 2024

You for sure noticed, but I'm just emphasizing-the parameters in the main.nf file should start with params.fasta, params.fasta_index etc... (instead of fasta, fasta_index)

But somehow-it's still not working! The pipeline can't run on my files, which exist for sure!
The script made a link from the files I specified to the working directory, but the fastqc interupts everything, showing the error
Skipping '149122-cf-M.ben_list.b37.sorted_1.fastq' which didn't exist, or couldn't be read.
And the file exists!!!

from methylseq.

ewels avatar ewels commented on July 18, 2024

Please link to specific lines in the code (permalink with commit hash) if there are bits that you think are wrong.

Have you pulled the latest version of the dev branch? nextflow pull

from methylseq.

ewels avatar ewels commented on July 18, 2024

Though the file not existing sounds like it could be a similar problem to the MultiQC issue. I suspect that these are not pipeline specific but relate to containers / Nextflow interaction with the underlying compute infrastructure...

from methylseq.

ewels avatar ewels commented on July 18, 2024

methylseq/main.nf

Lines 100 to 103 in d0dffea

bismark_index = params.genome ? params.genomes[ params.genome ].bismark ?: false : false
bwa_meth_index = params.genome ? params.genomes[ params.genome ].bwa_meth ?: false : false
fasta = params.genome ? params.genomes[ params.genome ].fasta ?: false : false
fasta_index = params.genome ? params.genomes[ params.genome ].fasta_index ?: false : false

from methylseq.

ewels avatar ewels commented on July 18, 2024

This should now be fixed in dev and will be released in 1.4.1 as a minor release.

Thanks all for spotting the error and helping with the fix 👍

from methylseq.

ewels avatar ewels commented on July 18, 2024

Hi all,

It took a long time sorry, but we finally just released v1.5 of the pipeline, which includes these fixes.

Apologies for the delay,

Phil

from methylseq.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.