Comments (5)
from basevar.
Hi, thank you for using BaseVar. I'll check this error and get back to you ASAP. I also want to confirm that you're using BaseVar to call variants for NIPT data, right? BaseVar is only suited to super low-depth (<1x) sequencing data. Best

On Aug 5, 2020, at 9:57 AM, liuhankui wrote:

Hi, I get two errors when using the software to call variants from 20,000 BAMs. First, it sometimes reports that it cannot open the index of a BAM, but I am sure the BAM and its index are both fine. Second, it reports too many reads in some regions and suggests that I reduce --buffer-size or increase --max_reads. Even after I increased the max reads to 5000000000, the software still does not work. Could you please help explain how these errors happen and give me some advice?
Thank you. Yes, we used BaseVar for NIPT sequencing data. There are a few samples with more than 1x depth, about 1.5x. I will remove these samples and try again. I set --batch-count to a small number, and the software seems to work now, but it still reports that it cannot open some BAMs or that there is a problem with their index.
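Before rerunning on all 20,000 files, it may help to rule out index problems on the file-system side. The following is a minimal sketch, not part of BaseVar: the function names `read_bam_list` and `find_index_problems` are hypothetical, and it assumes the BAM list is a plain text file with one path per line and that indexes follow the usual `sample.bam.bai` or `sample.bai` naming.

```python
import os


def read_bam_list(list_path):
    """Read one BAM path per line, skipping blank lines (assumed list format)."""
    with open(list_path) as handle:
        return [line.strip() for line in handle if line.strip()]


def find_index_problems(bam_paths):
    """Return (bam_path, problem) pairs for BAMs whose index looks suspect."""
    problems = []
    for bam in bam_paths:
        if not os.path.isfile(bam):
            problems.append((bam, "BAM file not found"))
            continue
        # Two common naming conventions: sample.bam.bai and sample.bai
        candidates = [bam + ".bai", os.path.splitext(bam)[0] + ".bai"]
        index = next((p for p in candidates if os.path.isfile(p)), None)
        if index is None:
            problems.append((bam, "no .bai index found"))
        elif os.path.getsize(index) == 0:
            problems.append((bam, "index file is empty"))
        elif os.path.getmtime(index) < os.path.getmtime(bam):
            # An index older than its BAM may have been built from an earlier file
            problems.append((bam, "index is older than the BAM"))
    return problems
```

Calling `find_index_problems(read_bam_list("bam.list"))` (with `bam.list` as a placeholder name) returns an empty list when every file passes these checks. A clean result only rules out missing, empty, or stale indexes; it does not explain errors that come from the tool itself.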
You don't have to delete the samples with more than 1x depth; BaseVar just takes one read from each position for each sample. I think you are using the development version of BaseVar, in which I have found some small bugs that I need more time to fix. You'd better use the stable version, the v0.8.0 release (https://github.com/ShujiaHuang/basevar/archive/v0.8.0.tar.gz). The usage is the same, and there is no need to set --buffer-size or --max_reads. You can just set --batch-count; I think you can try --batch-count 200 or --batch-count 500, which may suit your bamlist of 20,000 BAMs better.
When you use BaseVar for variant calling, I suggest adding sub-population group information for your samples by setting the parameter --pop-group sample_subpopulation_group.info, which allows you to calculate the allele frequency for each sub-population group, such as the Guangdong population and the Beijing population. The format of sample_subpopulation_group.info is very simple (only two columns) and looks like:
Sample1_id Beijing
Sample2_id Beijing
Sample3_id Beijing
Sample4_id Guangdong
Sample5_id Guangdong
Sample6_id Henan
Sample7_id Henan
...
I hope this version works for you.
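With thousands of samples, a malformed group file is easy to produce, so a quick sanity check before passing it to --pop-group can save a failed run. The sketch below is not part of BaseVar; the function name `parse_pop_groups` is hypothetical, and it assumes the two columns are separated by whitespace (spaces or tabs), as in the example above.

```python
from collections import Counter


def parse_pop_groups(lines):
    """Map each sample ID to its group, checking the two-column layout."""
    groups = {}
    for number, line in enumerate(lines, start=1):
        fields = line.split()  # assumption: whitespace-separated columns
        if not fields:
            continue  # skip blank lines
        if len(fields) != 2:
            raise ValueError(f"line {number}: expected 2 columns, got {len(fields)}")
        sample, group = fields
        if sample in groups:
            raise ValueError(f"line {number}: duplicate sample {sample}")
        groups[sample] = group
    return groups


example = """\
Sample1_id Beijing
Sample2_id Beijing
Sample3_id Beijing
Sample4_id Guangdong
Sample5_id Guangdong
Sample6_id Henan
Sample7_id Henan
"""

groups = parse_pop_groups(example.splitlines())
# Number of samples per sub-population group
group_sizes = Counter(groups.values())
print(dict(group_sizes))
```

Running the same function over the lines of the real file reports the first line that does not have exactly two columns or that repeats a sample ID.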
Hi Shujia, thanks a lot for your help. I will try this version and report back.
Version 0.8.0 works. It is great.