Comments (12)
Hi,
Thanks for bringing up this issue. It will be tricky to debug without access to the files. Could you share the input files so we can try to reproduce it? Thanks!
from deepvariant.
Could you give me an email address so I can send you a link to this chromosome's data?
from deepvariant.
Sure thing! You can send me the files at [email protected]
Additionally, segfaults can sometimes be caused by OOMs (running out of memory). What are the memory specs of the instance you are running this on? Thanks!
from deepvariant.
I also encountered the same problem. May I ask whether it has been solved, and if so, how?
from deepvariant.
@baozg and @yangxin-9 ,
In addition to sending the BAM files, could you please check that the files are not truncated? You can run the following command to check that the files are OK:
samtools quickcheck -v *.bam > bad_bams.fofn && echo 'all ok' || echo 'some files failed check, see bad_bams.fofn'
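If samtools is not at hand, one part of that check can be approximated in Python. This is only a minimal sketch: it tests whether a file ends with the 28-byte empty BGZF block that marks the end of a complete BAM file, and it does not reproduce the other checks that samtools quickcheck performs. The function name `has_bgzf_eof` is made up for this illustration.

```python
import os

# The 28-byte empty BGZF block that terminates a complete BAM file
# (the end-of-file marker described in the SAM/BAM specification).
BGZF_EOF = bytes.fromhex(
    "1f 8b 08 04 00 00 00 00 00 ff 06 00 42 43 02 00"
    "1b 00 03 00 00 00 00 00 00 00 00 00"
)


def has_bgzf_eof(path):
    """Return True if the file at `path` ends with the BGZF EOF marker.

    Hypothetical helper for illustration; a missing marker suggests the
    file was truncated, but a present marker does not prove the file is valid.
    """
    if os.path.getsize(path) < len(BGZF_EOF):
        return False
    with open(path, "rb") as handle:
        # Read only the last 28 bytes of the file.
        handle.seek(-len(BGZF_EOF), os.SEEK_END)
        return handle.read() == BGZF_EOF
```

Calling `has_bgzf_eof("sample.bam")` on each input (the file name here is a placeholder) gives a quick first signal before running the full samtools check.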
from deepvariant.
OK. I'll try that. Thank you for your reply.
from deepvariant.
I have checked my BAM file with the command you gave and it reports 'all ok', so the error is probably not caused by the BAM file.
from deepvariant.
After carefully bisecting your BAM file, it looks like the region that throws an error is chr12:7721068-7735636.
Looking at the pileup, there are 5 large (~11k) deletions in that region, of 3 different lengths: one is 11,843, two are 11,844, and two are 11,845. It looks like the trouble comes from attempting to represent and realign those INDEL candidates with 2 reads each. DeepVariant can't actually call deletions that long.
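If you set vsc_min_count_indels to 3, the problem goes away, so adding --make_examples_extra_args=vsc_min_count_indels=3 should fix the issue. Because each of those long deletions is supported by at most 2 reads, a minimum count of 3 keeps them from becoming candidates.
If desired, you can run DeepVariant on just that region with --regions=chr12:7721068-7735636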
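For anyone scripting their runs, here is a minimal Python sketch of appending those two flags to an existing command line. The helper name `with_indel_workaround` and the bare `run_deepvariant` base command are assumptions for illustration; only the two flags themselves come from this thread, and your own required arguments must be supplied.

```python
def with_indel_workaround(base_cmd, min_count_indels=3, region=None):
    """Return a copy of base_cmd with the workaround flags appended.

    Hypothetical helper: base_cmd is your own DeepVariant invocation as a
    list of arguments; nothing is executed here.
    """
    cmd = list(base_cmd)
    # Raise the minimum read support required for an indel candidate.
    cmd.append(
        f"--make_examples_extra_args=vsc_min_count_indels={min_count_indels}"
    )
    if region is not None:
        # Optionally restrict the run to a single region.
        cmd.append(f"--regions={region}")
    return cmd


# Placeholder base command; the other arguments your run needs are omitted.
base = ["run_deepvariant"]
print(" ".join(with_indel_workaround(base, region="chr12:7721068-7735636")))
```

The resulting list can be passed to `subprocess.run` once the base command contains the real arguments.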
We will work on fixing this on our end as well in our next release.
@yangxin-9 To avoid mixing issues that may or may not be related, please create a new issue that shows the command you ran and the output. Also, if possible, please send us the input files used so we can try to reproduce the issue ourselves.
from deepvariant.
Thanks for your careful examination. It's quite common to see this kind of divergent region in outcrossing plants, where mapping noise is mixed with true variants. Is it possible to report the region / reads when realignment fails? Or do I need to exclude this region before running DeepVariant?
from deepvariant.
Right now DeepVariant does not have the ability to report such a region by itself and skip it. You will need to exclude the problematic regions before running DeepVariant, or use vsc_min_count_indels to avoid candidate generation in these cases.
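One way to exclude a region is to run on the two flanks instead. The following is a minimal sketch that computes those flanking region literals; the helper name `regions_excluding` and the contig length used in the example are made up for illustration, so substitute the real length from your reference index. Whether running on the flanks alone avoids the crash has not been verified in this thread.

```python
def regions_excluding(contig, contig_length, bad_start, bad_end):
    """Return 1-based inclusive region strings covering the contig
    except the interval [bad_start, bad_end].

    Hypothetical helper for illustration only.
    """
    if not (1 <= bad_start <= bad_end <= contig_length):
        raise ValueError("bad interval must lie within the contig")
    regions = []
    if bad_start > 1:
        # Everything to the left of the excluded interval.
        regions.append(f"{contig}:1-{bad_start - 1}")
    if bad_end < contig_length:
        # Everything to the right of the excluded interval.
        regions.append(f"{contig}:{bad_end + 1}-{contig_length}")
    return regions


# 20,000,000 is an assumed contig length, not the real length of chr12.
print(regions_excluding("chr12", 20_000_000, 7721068, 7735636))
```

The resulting strings can then be supplied through the --regions flag mentioned earlier in this thread.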
from deepvariant.
Thank you so much. Now this sample runs smoothly.
from deepvariant.