
Comments (12)

lucasbrambrink commented on May 30, 2024

Hi,

Thanks for bringing up this issue. It'll be a bit tricky to debug this without access to the files. Would it be possible to share the input files so we can try to reproduce it? Thanks!

from deepvariant.

baozg commented on May 30, 2024

Could you give me an email address so that I can send you a link to the data for this chromosome?


lucasbrambrink commented on May 30, 2024

Sure thing! You can send me the files at [email protected]

Additionally, segfaults can sometimes be caused by OOMs (running out of memory). Do you have the memory specs of the instance you are running this on? Thanks!


baozg commented on May 30, 2024


yangxin-9 commented on May 30, 2024

I also encountered the same problem. Has it been solved yet? If so, how?


kishwarshafin commented on May 30, 2024

@baozg and @yangxin-9,

In addition to sending the BAM files, can you please also check that the files are not truncated? You can run the following command to check whether the files are OK:

samtools quickcheck -v *.bam > bad_bams.fofn   && echo 'all ok' || echo 'some files failed check, see bad_bams.fofn'


yangxin-9 commented on May 30, 2024

OK. I'll try that. Thank you for your reply.


yangxin-9 commented on May 30, 2024

I have checked my BAM file with the command you gave, and it prints 'all ok'. The error may not be caused by the BAM file.


lucasbrambrink commented on May 30, 2024

@baozg

After carefully bisecting your BAM file, it looks like the region that throws an error is chr12:7721068-7735636.

Looking at the pileup, there are 5 large (~11k) deletions in that region of 3 different lengths:
[image: pileup of the region]

One is length 11,843, two are 11,844 and two are 11,845. It looks like the trouble comes from attempting to represent and realign those INDEL candidates with 2 reads each. DeepVariant can't actually call deletions that long.

If you set vsc_min_count_indels to 3, the problem goes away, so adding --make_examples_extra_args=vsc_min_count_indels=3 should fix the issue. If desired, you can run DeepVariant on just that region with --regions=chr12:7721068-7735636.
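
For illustration, the two flags above could be combined into a full invocation roughly as follows. This is only a sketch: the file names, the output path, and --model_type=WGS are placeholders that do not come from this thread, and the script prints the command instead of running it.

```shell
# Dry-run sketch: assemble the DeepVariant command as a string and print it.
# REF, BAM, OUT_VCF and the model type are hypothetical placeholders.
REF="reference.fasta"
BAM="sample.bam"
OUT_VCF="output.vcf.gz"
REGION="chr12:7721068-7735636"   # region reported in this thread

# Require at least 3 supporting reads before an indel becomes a candidate.
EXTRA_ARGS="vsc_min_count_indels=3"

CMD="run_deepvariant --model_type=WGS --ref=${REF} --reads=${BAM} --regions=${REGION} --output_vcf=${OUT_VCF} --make_examples_extra_args=${EXTRA_ARGS}"

# Print only; copy the line into your own environment to actually run it.
echo "${CMD}"
```

Dropping the --regions flag would apply the same indel threshold to the whole run.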

We will work on fixing this on our end as well in our next release.

@yangxin-9 To avoid mixing issues that may or may not be related, please create a new issue that shows the command you ran and the output. Also, if possible, please send us the input files so we can try to reproduce the issue ourselves.


baozg commented on May 30, 2024

Thanks for your careful examination. Divergent regions like this are quite common in outcrossing plants, and they contain a mix of mapping noise and true variants. Is it possible to report the region or reads when realignment fails? Or do I need to exclude this region before calling variants with DeepVariant?


lucasbrambrink commented on May 30, 2024

Right now DeepVariant does not have the ability to report such a region by itself and skip it. You will need to exclude the problematic regions before running DeepVariant, or use vsc_min_count_indels to avoid candidate generation in these cases.
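
One possible way to do the exclusion is to build a BED file of everything except the problematic interval. The sketch below assumes a two-column "name, length" listing such as the first two columns of a samtools faidx index; the file names and the chr12 length are made-up placeholders. It converts the 1-based inclusive region from this thread into 0-based half-open BED intervals.

```shell
# Hypothetical input: contig name and length, tab-separated.
# The length below is a placeholder, not the real size of chr12.
printf 'chr12\t20000000\n' > genome.txt

# Region to skip, 1-based inclusive, as reported in this thread.
EXCL_CHR="chr12"
EXCL_START=7721068
EXCL_END=7735636

# Emit BED intervals (0-based, half-open) covering everything else.
awk -v c="$EXCL_CHR" -v s="$EXCL_START" -v e="$EXCL_END" '
  BEGIN { FS = OFS = "\t" }
  $1 != c { print $1, 0, $2; next }      # other contigs are kept whole
  {
    if (s > 1)  print $1, 0, s - 1       # part before the excluded region
    if (e < $2) print $1, e, $2          # part after the excluded region
  }' genome.txt > include.bed

cat include.bed
```

The resulting include.bed could then be passed to DeepVariant with --regions=include.bed, since --regions accepts BED files as well as region literals.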


baozg commented on May 30, 2024

Thank you so much. Now this sample runs smoothly.

