Comments (15)

BrunoGrandePhD commented on August 30, 2024

After fixing a bug in my calc_manta_vaf script, it can now handle empty VCF files. So, there's no need for me to exclude empty VCF files from BEDPE processing. Note that svtools produces no output at all (rather than a header-only file), which might confuse downstream rules, but I think it's better to have an empty file than no file at all.

from lcr-modules.

lkhilton commented on August 30, 2024

Tagging @BrunoGrandePhD


lkhilton commented on August 30, 2024

I'm realizing this may be a feature/bug of the configManta.py part of Manta, because the bam file is effectively ignored by the _manta_run rule. Perhaps this is fixable by using the sample_bam as an input to _manta_run, even though the rule doesn't actually use the bam file directly.
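
To illustrate why declaring the bam as an input could matter, here is a minimal sketch of timestamp-based re-run logic. It is a simplified model, not Snakemake's actual implementation, and the function name `needs_rerun` is hypothetical. The idea is that a file which is not listed among a rule's inputs can be updated without the rule ever being considered stale.

```python
from pathlib import Path


def needs_rerun(declared_inputs, output):
    """Return True if the output is missing or older than any declared input.

    Simplified model of timestamp-based staleness checks; files that are
    not in `declared_inputs` are never looked at.
    """
    output = Path(output)
    if not output.exists():
        return True
    out_mtime = output.stat().st_mtime
    return any(Path(p).stat().st_mtime > out_mtime for p in declared_inputs)
```

Under this model, a newer bam only triggers a re-run if the bam path is passed in `declared_inputs`, which matches the suggestion to add sample_bam to the rule's inputs.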


rdmorin commented on August 30, 2024

I suspect this has something to do with the checkpoints and a disconnect (i.e. variable mapping of file name) between what files are produced and what the inputs are.


lkhilton commented on August 30, 2024

I'm also having trouble getting the manta module to complete bedpe generation and symlinking. Here you can see 103 rnaSV/*.vcf.gz files, but only 98 bedpes.

lhilton@n105 /p/d/C/transcriptomes ❯❯❯ ls results/manta-1.0/03-bedpe/mrna--grch37/*/*.bedpe | wc -l
98
lhilton@n105 /p/d/C/transcriptomes ❯❯❯ ls results/manta-1.0/99-outputs/bedpe/mrna--grch37/rnaSV/*.bedpe | wc -l
98
lhilton@n105 /p/d/C/transcriptomes ❯❯❯ ls results/manta-1.0/99-outputs/vcf/mrna--grch37/rnaSV/*.vcf.gz | wc -l
103

The output of snakemake -np _manta_all is as follows:

Job counts:
        count   jobs
        1       _manta_all
        1       _manta_all_dispatch
        1       _manta_run
        3

I have one Manta job that keeps erroring out, so I expect one Manta run. However, I also expected this to show jobs relating to bedpe generation. Is the issue here that the dispatch file already exists?
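
To find which of the 103 VCFs lack a BEDPE, the two output directories can be compared by file name. This is a rough sketch that assumes each BEDPE shares its base name with the corresponding VCF (an assumption, not something confirmed in this thread); the helper names are hypothetical.

```python
from pathlib import Path


def sample_ids(directory, suffix):
    """Collect file names in `directory` ending in `suffix`, with the suffix removed."""
    return {p.name[: -len(suffix)] for p in Path(directory).glob(f"*{suffix}")}


def vcfs_without_bedpe(vcf_dir, bedpe_dir):
    """Return the sorted base names that have a .vcf.gz but no matching .bedpe."""
    return sorted(sample_ids(vcf_dir, ".vcf.gz") - sample_ids(bedpe_dir, ".bedpe"))
```

For example, calling `vcfs_without_bedpe` on the 99-outputs/vcf/mrna--grch37/rnaSV and 99-outputs/bedpe/mrna--grch37/rnaSV directories would list the samples to inspect.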


BrunoGrandePhD commented on August 30, 2024

I still need to look into the workflow not being triggered by newer input files, but the lower number of BEDPE files could be explained by the corresponding VCF files being empty. Could you check this?


lkhilton commented on August 30, 2024

You are correct, there are five empty vcfs.

Now I see that you're removing empty VCFs from the bedpe targets. Is it possible to output a warning message about that?
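
One way such a warning could look is sketched below. This is not the module's actual code: the function names are hypothetical, and "empty" is assumed to mean "no non-header lines", which covers both header-only and zero-record files.

```python
import gzip
import warnings


def vcf_has_records(path):
    """Return True if the VCF contains at least one non-header, non-blank line."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        return any(line.strip() and not line.startswith("#") for line in handle)


def filter_empty_vcfs(paths):
    """Keep VCFs that have records and warn about each one that is dropped."""
    kept = []
    for path in paths:
        if vcf_has_records(path):
            kept.append(path)
        else:
            warnings.warn(f"Skipping BEDPE target for empty VCF: {path}")
    return kept
```

Emitting the message through `warnings.warn` keeps the filtering itself unchanged while making the dropped samples visible in the log.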


BrunoGrandePhD commented on August 30, 2024

Absolutely, I can add a message. I might remove the checkpoint altogether because it seems to create more hassle than it's worth. I agree with Ryan: I suspect that it might account for the lack of re-runs when the upstream data is updated.


BrunoGrandePhD commented on August 30, 2024

@lkhilton: Did you re-run the module yet (presumably by deleting the output files)? If not, could you try deleting the "sentinel files" from the _manta_all_dispatch rule? They're hidden (start with a period) in the 99-outputs/bedpe subdirectory.

99-outputs/bedpe/{seq_type}--{genome_build}/.{tumour_id}--{normal_id}--{pair_status}.dispatched
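
Because the sentinel files are hidden, a plain ls will not show them. The following sketch lists them (and optionally deletes them) based on the path pattern above. The function names are hypothetical, and the code assumes the sentinels sit exactly one directory below the bedpe output directory, as in that pattern.

```python
from pathlib import Path


def find_sentinels(bedpe_root):
    """List hidden *.dispatched files one directory level below bedpe_root."""
    return sorted(Path(bedpe_root).glob("*/.*.dispatched"))


def remove_sentinels(bedpe_root, dry_run=True):
    """Return the sentinel files found; delete them only when dry_run is False."""
    sentinels = find_sentinels(bedpe_root)
    if not dry_run:
        for sentinel in sentinels:
            sentinel.unlink()
    return sentinels
```

Running `remove_sentinels("results/manta-1.0/99-outputs/bedpe")` first with the default `dry_run=True` shows what would be deleted before anything is removed.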


rdmorin commented on August 30, 2024

I wonder if it would be better if those files were not hidden. What is the rationale for that besides it making less "clutter" when we ls the directory?


BrunoGrandePhD commented on August 30, 2024

It was to minimize clutter, but I made that decision before I added the bedpe and vcf subdirectories in 99-outputs. I've now moved to:

CFG["dirs"]["outputs"] + "dispatched/{seq_type}--{genome_build}/{tumour_id}--{normal_id}--{pair_status}.dispatched"


BrunoGrandePhD commented on August 30, 2024

As far as I can tell, I have successfully removed the checkpoint from the manta module. This should simplify everything.


lkhilton commented on August 30, 2024

I can confirm that deleting the .dispatched file initiates a re-run of manta on updated bam files.


BrunoGrandePhD commented on August 30, 2024

I've updated the _get_manta_files() function to predict the output VCF files based on command-line options provided to Manta instead of waiting to see which ones are created by Manta. This should solve the issue reported here. I'll do a quick test before closing this issue.

https://github.com/LCR-BCCRC/lcr-modules/blob/dev-bgrande/modules/manta/1.0/manta.smk#L215-L252
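
The general idea of predicting outputs from the run mode, rather than inspecting what Manta produced, might look like the sketch below. This is not the code behind the link above: the function name and its parameters are hypothetical, and the mapping from mode to VCF names reflects my understanding of Manta's documented outputs, so it should be checked against the Manta user guide.

```python
def predict_manta_vcfs(has_normal, rna=False):
    """Predict Manta VCF file names from the run mode (illustrative mapping).

    Assumed mapping: candidate VCFs are always written; RNA mode adds rnaSV;
    tumour-normal runs add diploidSV and somaticSV; tumour-only runs add tumorSV.
    """
    names = ["candidateSV", "candidateSmallIndels"]
    if rna:
        names.append("rnaSV")
    elif has_normal:
        names += ["diploidSV", "somaticSV"]
    else:
        names.append("tumorSV")
    return [f"{name}.vcf.gz" for name in names]
```

Since the expected file list depends only on the options, it can be computed before Manta runs, which is what removes the need for a checkpoint.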


BrunoGrandePhD commented on August 30, 2024

I believe this issue is gone now with version 2.0 of the manta module (without the checkpoint). If not, we can reopen this issue.
