Comments (5)
Note: this issue's code is the same as in closed issue #768.
from deepvariant.
Hi @Luosanmu,
On your filtering question: we don't directly calculate many of the statistics in the INFO field during variant calling. We find that the most effective way to adjust filters is to use the Genotype Quality (GQ) field or the Phred-scaled likelihood (PL) field (GQ is mathematically derived from PL). If you want to increase sensitivity, the best approach is likely to post-process the VCF and extract the low-confidence REF calls using these fields.
On the question of why DeepVariant makes a call that differs from GATK: for any single call, it's difficult to give the exact reasons. Sometimes, looking at the reads and the reference in the region gives clues about why a call was made. If you have an IGV screenshot showing the region, it might be informative. DeepVariant does seem very confident that there isn't a variant here.
It would also help to know something about the sequencing and library prep. Is this Illumina data or PacBio data? Is this a PCR-free prep, or does it include PCR?
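As a rough illustration of the post-processing idea above, here is a minimal Python sketch that scans single-sample VCF lines and reports REF calls whose GQ falls at or below a cutoff. It assumes REF calls carry `RefCall` in the FILTER column, as in DeepVariant output. The function name `low_confidence_ref_calls`, the cutoff of 20, and the example records are hypothetical and chosen only for illustration; a suitable cutoff depends on your data.

```python
def low_confidence_ref_calls(vcf_lines, max_gq=20):
    """Return (chrom, pos, ref, alt, gq) for RefCall records with GQ <= max_gq.

    Assumes a single-sample VCF. max_gq=20 is an arbitrary illustrative cutoff.
    """
    hits = []
    for line in vcf_lines:
        if line.startswith("#"):
            continue  # skip header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 10:
            continue  # no FORMAT/sample columns to inspect
        chrom, pos, _id, ref, alt, _qual, filt, _info, fmt, sample = fields[:10]
        if filt != "RefCall":
            continue  # keep only sites called as reference
        keys = fmt.split(":")
        values = sample.split(":")
        if "GQ" not in keys:
            continue
        idx = keys.index("GQ")
        if idx >= len(values) or values[idx] == ".":
            continue  # GQ missing for this record
        gq = int(values[idx])
        if gq <= max_gq:
            hits.append((chrom, int(pos), ref, alt, gq))
    return hits


# Made-up records for demonstration only (not taken from the sample in this thread).
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tSAMPLE",
    "chr1\t100\t.\tG\tGA\t0.1\tRefCall\t.\tGT:GQ:DP\t0/0:12:57",
    "chr1\t200\t.\tC\tT\t0\tRefCall\t.\tGT:GQ:DP\t0/0:45:60",
    "chr1\t300\t.\tA\tG\t50\tPASS\t.\tGT:GQ:DP\t0/1:50:40",
]

for record in low_confidence_ref_calls(example):
    print(record)
```

The records it returns are candidates for manual review (for example in IGV), not calls to be accepted automatically.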
Thank you,
Andrew
Hi, @AndrewCarroll
The IGV screenshots for chr5-147499874-G-GA are here.
The sample is Illumina NGS data.
Thanks,
Luosanmu
Thank you @Luosanmu
I see. From your image, I think I understand why DeepVariant would make a REF call here. The variant in question is a 1-bp extension of a homopolymer (10A -> 11A). Homopolymers are generally difficult to sequence through. There are 47 reference-supporting reads and 10 alternate-supporting reads (~16%), which is far from the 50% typically expected if the position is heterozygous.
DeepVariant's model has to weigh which explanation is more likely: that this is a real HET event and random sampling of the alleles skewed the observations as far as 16%, or that a sufficiently recurrent 1-bp insertion error during sequencing explains these insertions at this ratio.
Presumably, when DeepVariant saw similar situations during training, they more often turned out to be insertion errors. Whether that is what is truly going on in your sample is difficult for me, as a human, to say.
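To get a feel for the first explanation (a true HET skewed by random sampling), the read counts above can be run through a simple binomial model. This is only a back-of-the-envelope sketch: DeepVariant is a neural network and does not perform this calculation, and the 0.15 per-read insertion error rate below is a hypothetical value picked for illustration, not a measured one.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_cdf(k, n, p):
    """Probability of at most k successes in n independent trials."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))


ref_reads, alt_reads = 47, 10   # counts quoted in the comment above
n = ref_reads + alt_reads
alt_fraction = alt_reads / n    # fraction of these reads supporting the insertion

# Chance of seeing 10 or fewer alt reads if the site were a true HET
# with each read equally likely to carry either allele.
p_het_tail = binom_cdf(alt_reads, n, 0.5)

# Likelihood of exactly 10 alt reads under each explanation.
hypothetical_error_rate = 0.15  # assumption for illustration only
lik_het = binom_pmf(alt_reads, n, 0.5)
lik_err = binom_pmf(alt_reads, n, hypothetical_error_rate)

print(f"alt fraction: {alt_fraction:.3f}")
print(f"P(alt reads <= {alt_reads} | HET): {p_het_tail:.2e}")
print(f"likelihood under HET: {lik_het:.2e}")
print(f"likelihood under hypothetical error rate: {lik_err:.2e}")
```

Under this simplified model, a skew this large is very unlikely for a true heterozygous site, which is consistent with the confident REF call. The model ignores prior probabilities, mapping artifacts, and base quality, so it should be read as intuition rather than evidence about this particular sample.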
Hi @Luosanmu,
Due to inactivity on this issue, I'll close it. Please feel free to follow up if you have more questions.