Comments (4)
Hello @charliechen912ilovbash,
Sorry for replying so late.
It is well known that repetitive sequence disturbs alignment, so the breakpoints reported on an individual read can be inaccurate. SV callers collect the breakpoints on each read to infer SV candidates, and treating inaccurate breakpoints directly as SV signatures would produce low-quality SV positions. To overcome this, cuteSV clusters all breakpoint signatures within a relatively small region to generate "consensus" SV breakpoint groups, then divides each group into possible SV events according to their length signatures. After that, it reports the final SV calls and the corresponding genotypes. For more details, please read our paper.
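To make the two steps above more concrete, here is a minimal sketch of the idea, not cuteSV's actual code. The `Signature` class, the function names, and the thresholds (`max_gap`, `max_ratio`, `min_support`) are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class Signature:
    pos: int      # breakpoint position reported by one read
    length: int   # SV length reported by that read


def cluster_by_position(sigs, max_gap=200):
    """Group signatures whose sorted positions are at most max_gap apart."""
    clusters = []
    for sig in sorted(sigs, key=lambda s: s.pos):
        if clusters and sig.pos - clusters[-1][-1].pos <= max_gap:
            clusters[-1].append(sig)
        else:
            clusters.append([sig])
    return clusters


def split_by_length(cluster, max_ratio=0.3):
    """Split one positional cluster into groups of similar SV length."""
    groups = []
    for sig in sorted(cluster, key=lambda s: s.length):
        if groups and sig.length - groups[-1][-1].length <= max_ratio * groups[-1][-1].length:
            groups[-1].append(sig)
        else:
            groups.append([sig])
    return groups


def call_svs(sigs, min_support=2, max_gap=200, max_ratio=0.3):
    """Return (consensus_pos, consensus_len, support) for each supported group."""
    calls = []
    for cluster in cluster_by_position(sigs, max_gap):
        for group in split_by_length(cluster, max_ratio):
            if len(group) >= min_support:
                pos = int(median(s.pos for s in group))
                length = int(median(s.length for s in group))
                calls.append((pos, length, len(group)))
    return calls
```

In this sketch the consensus position and length are simply the medians of each group, so a single read with a shifted breakpoint has little effect on the reported call.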
I hope this is helpful to you.
Best regards,
Tao
from cutesv.
Hi, Tao
But for assembly-based SV calling, does cuteSV still cluster breakpoints? Since there is only one read in the SAM file, is it possible for cuteSV to report these breakpoints?
Hello @baozg,
Thanks for pointing this out.
Actually, cuteSV achieves assembly-based SV calling by converting the typical SV callset into a diploid-based SV callset. That is, cuteSV first generates the initial SV callset using the clustering approach mentioned above (there can still be more than one SV signature at a locus even though there is only one contig per haplotype). Then cuteSV resolves the haplotype tags of each SV call to produce a phased genotype.
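As a rough illustration of that last step, the sketch below maps the haplotype tags supporting a call to a phased genotype string. It assumes the tags are the integers 1 and 2, one per haplotype contig; the function name and tag encoding are hypothetical and do not reflect cuteSV's internal representation.

```python
def phased_genotype(support_haplotypes):
    """Map the set of haplotype tags (assumed to be 1 and/or 2) that
    support an SV call to a phased VCF-style genotype string."""
    h1 = 1 in support_haplotypes
    h2 = 2 in support_haplotypes
    return f"{int(h1)}|{int(h2)}"
```

For example, a call supported only by the contig tagged 1 would get `1|0`, and a call supported by both contigs would get `1|1`.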
Tao
Hi, Tao
But an inbred plant or a haploid human cell line, such as A. thaliana or CHM13, has only one haplotype. Is the clustering step still needed in that case?
Besides, as you mentioned, if I want to call variants with cuteSV from population-level assemblies, it would be better to put all the assemblies in one alignment file so that the clustering step can refine the breakpoints, right?
Zhigui